How to learn ML4EO pipeline design

In the previous three blogs, I shared my experience in my first ML4EO project, first freelance project and first PhD project. The paths to the solutions are full of loops. I tried things, failed, and tried again. I learned about cloud cover over the equator, generalisation gaps and cases when a pre-trained model solved a complex problem.

In this blog, I wrap up the “How to design ML4EO pipelines” mini-series. The ball is in your court now: I’ll give tips for developing your ML4EO skills.

The challenge never goes away

You might have noticed that my challenges changed as I progressed through projects. In my master’s thesis, I struggled with technical aspects like libraries, packages, and downloading data. The freelance project was about having confidence in my ML4EO abilities. And finally, in my PhD, it wasn’t so much about the pipeline anymore but about finding the right problem to solve.

You can summarise my learning journey like this:

Learning to use the tools of the trade
Doubting my ability to use the tools
Embracing the tools and moving on to other challenges

As you gain experience, challenges never go away; they just change. It’s that way for every skill. I went through the same thing with learning to use watercolour. First, I struggled with getting the water-to-paint ratio right or choosing the right paper. Then, I had to learn to mix clean colours. And so forth! As you learn, you will always find a new problem to solve. When it gets easy, it means you’ve stopped learning.

Learning faster

There are two ways to speed up the process of learning:

Make mistakes
Choose projects over courses

Make mistakes

In “The Right Kind of Wrong”, Amy C. Edmondson explains there are two different contexts related to failures: exploration and execution.

In execution contexts, what we need to do and achieve is clear. Any mistake can cost us. For instance, you don’t want to deploy a model that makes wrong predictions. Therefore, it pays off to be hesitant and think actions through to avoid mistakes.

However, in the exploration context, we’re navigating a new environment. We don’t yet know the right path forward. As a result, mistakes are not just inevitable; they inform us of which direction to take. We need to try a lot of different things to find the right path. Any hesitation only holds us back from finding the solution.

My freelance project was clearly in the exploration context: new data, new task, new collaborators. But I treated it as an execution context and tried to avoid mistakes. I wouldn’t have gotten stuck had I recognised the failure of building detection as valuable information. In cases like this, making mistakes will help you learn faster.

Choose projects over courses

The first thing my driving instructor said when I got my driver’s license was: “You must drive alone right away the next time you drive.” And I heard many similar stories from my friends and family. The driving lessons teach you enough to use the car and know the traffic rules. You only really learn to drive once you start driving yourself.

It’s the same for ML4EO courses. Like many people, I used to look for online courses about satellite images or deep learning. Online courses can teach you the traffic rules or how to start the car. But the real magic happens when you work on projects. Projects rapidly expand your skills because applying new knowledge cements it in your brain. And the less experience you have with EO data, the more there is to learn.

Are online courses useless? No. I just think you shouldn’t expect to become an expert by following online courses. My uni courses combined lectures and (hard!) research projects to teach you theory and how to design a good research project in that field. This is how I learned about experimental design and how to interpret results. We did a project, wrote a report, and got feedback from teaching assistants. An automatically graded online test can’t replace that.

Collaboration fills knowledge gaps

Finally, a big takeaway from my projects is the value of interdisciplinary collaboration. Working with EO scientists brought a lot to my projects. It made them easier because they helped me with the data. And in many ways, it made the projects harder, too. I had to learn to explain my methods and intuitions without ML jargon. Conversely, I had to learn to understand how EO scientists communicate.

Most of all, collaboration helps you to correctly integrate another field into your project. When you work with EO data in an ML-only team, you have no feedback on whether your problem is a relevant EO problem at all. Or even whether you’re treating the data correctly.

It’s true the other way around, too. When applying ML with an EO-only team, you have nothing but intuition to guide your ML design. The point about ML is that it’s not about intuition but about rules, definitions and best practices.

In other words, to become an ML4EO expert, you need to:

Acknowledge that you’re not an expert in everything
Get help from others to fill your knowledge gaps

Conclusion

In this series, we covered a really practical part of ML4EO: the analysis steps. In a paper, you’d find them in the methods section and maybe in the results. But a paper has other sections, too. There is still so much more to learn: planning your research by reading papers, defining problem statements, and designing evaluations. Stay tuned if you want to learn more about ML4EO!