Photo Credit: Pixabay
Posted: May 16, 2024
I thought config files sucked, but I was wrong
I expected an easy ride when downloading deep learning code from Torchvision and Microsoft repositories. I was fresh into my PhD, full of noble programming goals and a slight obsession with software engineering. So, I downloaded the code and started going through it to pick out the bits I needed. Though the code was nice, clear and consistent, one problem was really, really pissing me off: Who the heck thought it was a good idea to pass all settings in a single dictionary called config?
So, let me take a step back here. When I'm using other people's code, I spend a lot of the time tracing the origins of settings or arguments so I can remove the code that I do not need or find the exact numbers (think: learning rate, batch size, number of layers, etc.) used in experiments. This works very nicely in VSCode -- you can click through it. Well, that grand plan breaks down completely when people pass a config dictionary instead of separate variables for each setting. This dictionary could have any number of settings in there, and if you want to know when a particular setting was used, you have to carefully backtrack and read a lot of code. Going back far enough, you find the birthing place of these dictionaries: a folder with YAML files with configurations, a configuration for each network or experiment.
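To make the difference concrete, here is a minimal sketch of the two styles. All function names and setting values are made up for illustration; none of them come from the Torchvision or Microsoft code.

```python
# Hypothetical example: names and values are assumptions for illustration only.

def make_settings_explicit(learning_rate: float, batch_size: int) -> dict:
    """Named arguments: the signature lists every setting this function uses,
    so an editor can jump straight from a parameter to the call that set it."""
    return {"lr": learning_rate, "batch_size": batch_size}


def make_settings_from_config(config: dict) -> dict:
    """Config dictionary: the signature reveals nothing. To learn which keys
    are read, you have to read the body (and every function it calls)."""
    return {"lr": config["learning_rate"], "batch_size": config["batch_size"]}


# The dictionary may carry settings this function never touches.
config = {"learning_rate": 0.01, "batch_size": 32, "num_layers": 18}

explicit = make_settings_explicit(learning_rate=0.01, batch_size=32)
from_config = make_settings_from_config(config)
```

Both calls produce the same result, but only the first one tells you at the call site which settings matter.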
This all got me really annoyed. Why do they use this system, I asked myself angrily. Like a young bull, I was impatient. I just wanted to use the code ASAP; I didn't feel like reading a whole bunch of source code of which I would only use a part. Now I had to do all this detective work to find what I needed and make sure nothing was overwritten anywhere.
After a whole lot of grunting, sighing and angry clicking, I found what I needed and resolved to do it all much, much better. I was going to make nice, named arguments for each setting. Nice and traceable. Yes, well, that worked for the first version of the code. Not long after, I needed to make adjustments. Many adjustments: new functionality, new experiments with different settings. I had to pass a bazillion command line arguments or hardcode bash scripts every time. So, I started making dictionaries with settings for specific scenarios: Isn't this much easier than adding a new argument to every function?
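For the curious, this is roughly the pattern I drifted into, as a minimal sketch rather than my actual code. The setting names, the defaults and the file name are all hypothetical, and I use JSON here instead of YAML only so the example runs with the standard library alone.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical defaults; every experiment starts from these.
DEFAULTS = {"learning_rate": 0.01, "batch_size": 32, "num_layers": 18}


def load_config(path: Path) -> dict:
    """Read one experiment file and merge it over the defaults.
    Keys present in the file win; everything else keeps its default."""
    overrides = json.loads(path.read_text())
    return {**DEFAULTS, **overrides}


def train(config: dict) -> str:
    """Stand-in for a training run: just reports the settings it received."""
    return (
        f"lr={config['learning_rate']} "
        f"bs={config['batch_size']} "
        f"layers={config['num_layers']}"
    )


# A new experiment is a new small file, not a new argument on every function.
with tempfile.TemporaryDirectory() as tmp:
    experiment_file = Path(tmp) / "big_batch.json"  # hypothetical file name
    experiment_file.write_text(json.dumps({"batch_size": 256}))
    summary = train(load_config(experiment_file))

print(summary)  # lr=0.01 bs=256 layers=18
```

Adding a setting now means touching the defaults and the one place that reads it, which is exactly the convenience I had been grumbling about.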
Can you see where this is going? It didn't take much more of this until I caved and came to the natural conclusion that config files are, after all, really convenient and not stupid at all. But did I feel stupid? Yes. Was I relieved that I understood the big secret? Also yes. My point is: it's good to be critical and curious. Ask questions. Don't accept the status quo; that's how we make progress. But sometimes, that's also how we reinvent the wheel.
I think we learn things in different ways. We learn that complementary colour palettes look nice because we can see how they harmonise in a painting (check out Van Gogh). We understand the rules of a board game because somebody explained them to us. Sometimes, we can only learn by experiencing things for ourselves. Maybe if you're a mule like me, the last category is a bit bigger -- and that's okay!