Working with Complex Configuration, Abstracted Systems



Let's take an example, a real-world example 🌍🌎🌏. I'm currently working with Backstage Software Templates. I even wrote a bit about it here. I need to write a configuration, or template, which creates a form in a Web UI and then executes some steps. This involves Backstage-specific stuff, like actions (each action performs one step, and you can write custom actions), and React JSON Schema Form, which renders a form from JSON or, in the case of Backstage, from YAML, which is basically a superset of JSON.

This has kind of become a thing, where you write configuration which becomes kind of like code. I write YAML configuration to say which form fields are required, and how to render the form based on some condition. For example, it's a form about deployment, so I ask stuff like "Do you want to enable metrics scraping?", "Do you want to enable autoscaling?", "Do you want to enable health checks?" and then ask for more information ℹ️ if they say "Yes 👍🙌" and skip it if they say "No ❌❎". More information means things like health check endpoint(s), metrics endpoint(s), and autoscaling parameters: is it CPU based or memory based, and within that, is it exact value based, utilization (percentage) based, average value based, etc.
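To make the "ask more only if they say Yes" idea concrete, here is a minimal sketch in plain Python. It is not how Backstage or React JSON Schema Form express conditions; the `FORM` mapping and all field names in it are made up for illustration.

```python
# Hypothetical form definition: each yes/no toggle maps to the follow-up
# fields that only make sense when the toggle is switched on.
FORM = {
    "enable_metrics": ["metrics_endpoint"],
    "enable_autoscaling": ["autoscaling_metric", "autoscaling_target"],
    "enable_health_checks": ["health_check_endpoint"],
}


def fields_to_ask(answers: dict) -> list:
    """Return the follow-up fields to show for the given yes/no answers."""
    follow_ups = []
    for toggle, extra_fields in FORM.items():
        if answers.get(toggle, False):
            follow_ups.extend(extra_fields)
    return follow_ups


def missing_fields(answers: dict) -> list:
    """Return follow-up fields that are required but not filled in yet."""
    return [name for name in fields_to_ask(answers) if name not in answers]
```

Calling `missing_fields({"enable_autoscaling": True, "autoscaling_metric": "cpu"})` would report that `autoscaling_target` still needs an answer, while a form with every toggle off asks for nothing extra.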

If I was writing ✍️ code 🧑‍💻 for all this, it would be a hell of a lot harder. But writing configuration is not exactly easy either. I still have to write the configuration in YAML and then test it out by rendering it in a Web UI. I'm wondering 🤔💭 how I can make it easier for me, especially the testing part. It's also hard to write long YAML files with if, then, etc. and to do value checks in YAML. Basically, I'm writing code, but minimal code, in YAML, lol 😂🤣
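One way the testing part might get easier is to check the configuration as data before it ever reaches the Web UI. The sketch below assumes the YAML has already been loaded into a Python dict, and it uses a made-up layout (`properties`, `rules`, `if`, `then_require`), not the actual Backstage or JSON Schema format. It only catches one class of mistake: rules that refer to fields that were never defined.

```python
def lint_form(schema: dict) -> list:
    """Return human-readable problems found in a form definition."""
    known = set(schema.get("properties", {}))
    problems = []
    for index, rule in enumerate(schema.get("rules", [])):
        condition_field = rule.get("if")
        if condition_field not in known:
            problems.append(
                f"rule {index}: unknown condition field {condition_field!r}"
            )
        for name in rule.get("then_require", []):
            if name not in known:
                problems.append(f"rule {index}: unknown required field {name!r}")
    return problems


# Hypothetical form: the second rule refers to fields that do not exist.
SCHEMA = {
    "properties": {
        "enable_health_checks": {"type": "boolean"},
        "health_check_endpoint": {"type": "string"},
    },
    "rules": [
        {"if": "enable_health_checks", "then_require": ["health_check_endpoint"]},
        {"if": "enable_metrics", "then_require": ["metrics_endpoint"]},
    ],
}

for problem in lint_form(SCHEMA):
    print(problem)
```

A check like this runs in milliseconds and could sit in a unit test, so a typo in a field name shows up long before someone clicks through the rendered form.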

For some reason it feels very frustrating. It also seems like it can break easily. Maybe not. Or maybe it can. Only time will tell, once users start using it and more changes get incorporated into it. More changes mean more features, more powerful features, more complexity, and possibly more bugs :) 🐛🐜🐞

For now, I'm just doing it. But it seems very lame 😒. I guess I want to do more of other stuff, lol 😂 Like Databases. But I think I might crib about something similar when working with Databases or anything else I happen to be interested in at that moment or phase.

One thing to note is that more and more things are becoming configuration, and there's more abstraction. Abstraction gets complicated when you want more out of the system: the abstraction layer starts to expose more and more knobs for the complicated stuff, and this ends up leaking out features of the underlying system that the abstraction is trying to hide.

Let's take an example. Abstracted System A abstracts a System B which has 100 features with 1000 inputs / knobs. Abstracted System A probably provides all 100 features but with 100 inputs / knobs. Once users start asking for more and more knobs, to be able to tweak the features and realize the full power of the underlying System B, Abstracted System A will start leaking out the implementation details of System B. These are the so-called "leaky abstractions": they give direct or indirect access to System B so that the user can realize its full power if needed. At that point, when the user uses almost all of the 100 or so features of System B and works with almost all of its 1000 or so inputs, Abstracted System A kinda becomes unnecessary, or at least not so useful. It's almost a waste, unless it still makes System B easier to use than it would be when used directly without any abstraction.

For example, compilers and transpilers are really helpful, right? Compilers transform high-level programming language code into lower-level code. Transpilers are similar, except that they transform high-level programming language code into other high-level code, like newer ECMAScript version code into older ECMAScript version code, using a transpiler like https://babeljs.io/. That way older browsers that understand only the older ECMAScript version can still load modern web pages 📃, while developers get to write modern code with modern language features, some of which are probably not even supported in the latest browsers yet. How cool 😎 👌 🆒 is that? It's also a bit absurd, and it adds some complexity: developers have to understand that they write code and then need some processing step (compiling, transpiling, etc.) for it to work in their browser or some other environment :)
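The System A / System B story can be sketched in a few lines. Everything here is hypothetical: the knob names, the sizes, and the `raw_overrides` escape hatch are invented to show the shape of the problem, not taken from any real tool.

```python
from typing import Optional

# Hypothetical "System B": the underlying system with many low-level knobs.
SYSTEM_B_DEFAULTS = {
    "replicas": 1,
    "cpu_millicores": 250,
    "memory_mb": 256,
    "max_surge": 1,
    "termination_grace_seconds": 30,
}

# Hypothetical "System A": a couple of named sizes that hide those knobs.
SIZES = {
    "small": {"replicas": 1, "cpu_millicores": 250, "memory_mb": 256},
    "large": {"replicas": 3, "cpu_millicores": 1000, "memory_mb": 2048},
}


def system_a_config(size: str, raw_overrides: Optional[dict] = None) -> dict:
    """Expand a simple size into a full System B configuration.

    raw_overrides is the escape hatch: every key in it is passed straight
    through to System B, which is exactly where the abstraction leaks.
    """
    config = dict(SYSTEM_B_DEFAULTS)
    config.update(SIZES[size])
    for key, value in (raw_overrides or {}).items():
        if key not in SYSTEM_B_DEFAULTS:
            raise KeyError(f"unknown System B knob: {key}")
        config[key] = value
    return config
```

As long as callers write `system_a_config("large")`, System A is doing its job. Once most calls look like `system_a_config("large", {"max_surge": 2, "termination_grace_seconds": 60})`, users need to know System B anyway, and the wrapper is adding little beyond defaults.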

Anyways, that's how abstractions can be tricky and how configuration can become kind of complex, with too many features and too much power. Take Kubernetes YAML Configuration / YAML Manifests, which exist for any and all kinds of resources. All the complexity lies in the Kubernetes Operators / Controllers, which look at the input YAML Manifest and act on it to do some complex stuff. The YAML Manifest itself encapsulates a lot of complex data and control: it says what to do and, very rarely, a bit of how to do it or what strategy or algorithm to use. The controller has the complete details of "How" to do something, and that's the power of declarative systems like Kubernetes: the user just says "What", need NOT say "How", and everything is taken care of. The catch is that one has to be very careful about giving the right configuration, or things might just break and cause issues. These days, there are configuration testing tools and systems. https://www.conftest.dev/ is one such example.
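To give a feel for what "testing configuration" can mean, here is a rough sketch of a policy check over a Deployment-shaped manifest that has already been parsed into a Python dict. Conftest itself expresses policies in Rego, not Python, and the two rules below (at least 2 replicas, resource limits on every container) are made-up example policies, not anything Kubernetes or Conftest mandates.

```python
def check_deployment(manifest: dict) -> list:
    """Return policy violations for a Deployment-shaped dict."""
    violations = []
    spec = manifest.get("spec", {})

    # Example policy 1 (an assumption): run at least two replicas.
    if spec.get("replicas", 1) < 2:
        violations.append("replicas should be at least 2")

    # Example policy 2 (an assumption): every container declares limits.
    containers = spec.get("template", {}).get("spec", {}).get("containers", [])
    for container in containers:
        limits = container.get("resources", {}).get("limits")
        if not limits:
            name = container.get("name")
            violations.append(f"container {name!r} has no resource limits")
    return violations


# A small manifest that breaks both example policies.
MANIFEST = {
    "kind": "Deployment",
    "spec": {
        "replicas": 1,
        "template": {"spec": {"containers": [{"name": "web", "image": "web:1.0"}]}},
    },
}

for violation in check_deployment(MANIFEST):
    print(violation)
```

The point is that the check runs against the "What" in the manifest, without needing a cluster or a controller to find out that something is off.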

A lot of people in the industry talk about how configuration now needs to be tested, and not just code. Configuration has kind of become code, or at least the level at which users work, given the layers of abstraction, and a single line or word can carry a lot of weight / power / features. This is like how people use Annotations in Java and how a single Annotation does a lot of magic 🪄✨. It's basically just configuration. Somewhere we do define what happens when a piece of Java code uses some Annotation. I think there are similar ideas in other languages too.

Anyways, I'm looking forward to figuring out how to deal with these magical things, especially powerful configuration: how to test it, how to validate it, and how to work with it with ease and with a clear understanding of what's going on. Or else the user is left to read tons of documentation 📃📄📑 to understand what a single configuration does. For example, databases and other complex systems have complex information ℹ️ behind each knob / configuration / parameter / setting. Data systems like Databases, Big Data Systems, and Data Streaming Systems have a lot of complex parameters for each feature, and sometimes several inter-related settings all affect the same or a related set of features.

Heck, there are even companies that help you with "fine-tuning" your database, based on expertise in the database and also on learning from your data 📈📉📊 in the database. These companies exist because the problem exists: there are so many parameters to manage and configure, you need to configure them well, and you need to keep maintaining them as requirements change. It also depends on your data, and everyone has different kinds of data, uses the database in different ways, and uses different features. Ottertune was one such company; it looks like it's dead now 😵 💀 ☠️ unfortunately. dbtune is another such company. Anyways, this is just to say that configuration is going to get stronger and stronger: more powerful, and more complex too, depending on how much power the user wants and how much power the user is given.
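Inter-related knobs are where a small automated check can already help. The sketch below uses invented setting names (`cache_mb`, `worker_count`, `per_worker_mb`, `memory_limit_mb`) and an invented rule; it does not describe any real database's parameters, only the idea that some settings have to be validated together rather than one at a time.

```python
def check_memory_budget(settings: dict) -> list:
    """Check a hypothetical rule that ties several knobs together.

    Assumed rule: cache_mb + worker_count * per_worker_mb must fit
    within memory_limit_mb.
    """
    problems = []
    worker_total = settings["worker_count"] * settings["per_worker_mb"]
    needed = settings["cache_mb"] + worker_total
    limit = settings["memory_limit_mb"]
    if needed > limit:
        problems.append(
            f"cache_mb + worker_count * per_worker_mb = {needed} MB "
            f"exceeds memory_limit_mb = {limit} MB"
        )
    return problems
```

Each value can look perfectly reasonable on its own: a 4096 MB cache, 100 workers, 64 MB per worker, and an 8192 MB limit. Only the combination (4096 + 100 * 64 = 10496 MB) breaks the budget, and that is the kind of thing that otherwise takes a lot of documentation reading to notice.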

I'll write more about configuration and abstract systems in future posts :D
