<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Karuppiah's Blog]]></title><description><![CDATA[Karuppiah's Blog]]></description><link>https://karuppiah.dev</link><generator>RSS for Node</generator><lastBuildDate>Thu, 16 Apr 2026 05:19:25 GMT</lastBuildDate><atom:link href="https://karuppiah.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[inodes]]></title><description><![CDATA[inodes is a concept in Linux. Oh wait…no
Fun fact that I learned while experimenting on my macOS - I can see that the term and concept of inodes exists in the context of macOS too
Looks like it’s a “Unix” thing and Linux and Darwin, both are Unix bas...]]></description><link>https://karuppiah.dev/inodes</link><guid isPermaLink="true">https://karuppiah.dev/inodes</guid><category><![CDATA[Linux]]></category><category><![CDATA[linux for beginners]]></category><category><![CDATA[linux-basics]]></category><category><![CDATA[linux kernel]]></category><category><![CDATA[filesystem]]></category><category><![CDATA[#FileSystemManagement]]></category><category><![CDATA[filesystems]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Sun, 04 Jan 2026 16:29:50 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/cckf4TsHAuw/upload/eadcd840fbea1a01afa0944145beee2a.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>inodes is a concept in Linux. Oh wait…no</p>
<p>Fun fact that I learned while experimenting on my macOS - I can see that the term and concept of inodes exist in the context of macOS too</p>
<p>Looks like it’s a “Unix” thing - Linux and Darwin are both Unix-based / Unix-like systems</p>
<blockquote>
<p>I get to learn more about Unix next, haha. And find out if the term “Unix based systems” even makes sense or if it’s just Unix-like systems and that’s it</p>
</blockquote>
<p>“inode” is basically short for “index node”</p>
<p>What are inodes? inodes store metadata about files/directories - like file size, where the file contents are present on the disk, what the access permissions are and so on. Interestingly, an inode doesn’t have the name of the file/directory in it, and it of course does not have the content of the file/directory. It’s just metadata</p>
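<p>Here’s a minimal sketch of peeking at that metadata in Python (using a throwaway temp file, just for illustration) - <code>os.stat()</code> surfaces the inode’s fields:</p>
<pre><code class="lang-python">import os
import stat
import tempfile

# Create a throwaway file so the example is self-contained
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

info = os.stat(path)

print("inode number:", info.st_ino)                 # index into the inode table
print("size (bytes):", info.st_size)                # metadata stored in the inode
print("permissions:", stat.filemode(info.st_mode))  # e.g. -rw-------
print("hard links:", info.st_nlink)

# Note: the file's *name* is not here - it lives in the directory
# entry that maps the name to this inode number

os.unlink(path)
</code></pre>
<p>You can see the same inode number with <code>ls -i</code></p>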
<p>Also, searching for inodes on Google shows that inodes is a term in the context of file systems. I wonder if there are file systems that work across different kernels. Something to check out</p>
<p>Anyways, it makes sense. inode is in fact a file system level thing. I learned that you set inode limits when creating file systems. So, it’s not exactly a kernel level thing, where it’s about Linux, Darwin etc. inode comes up in Unix-like file systems basically and is a data structure which stores metadata</p>
<p>Fun thing to do - check out how many inodes you have in one of your file systems</p>
<p>Another fun thing to do - if you have a spare machine, try to exhaust all the inodes without exhausting the disk space. Yes. This is possible! Try and see what happens! You can also create a test file system and do this on the test file system :)</p>
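<p>The counting part of both experiments can be sketched in Python with <code>os.statvfs()</code>, which reports filesystem-level inode counts (caveat: on some filesystems, like btrfs, <code>f_files</code> can be 0 because inodes are allocated dynamically):</p>
<pre><code class="lang-python">import os

vfs = os.statvfs("/")  # pick any mount point

print("total inodes:", vfs.f_files)
print("free inodes:", vfs.f_ffree)
print("inodes used:", vfs.f_files - vfs.f_ffree)
print("free blocks:", vfs.f_bfree)

# Every file consumes an inode even if it's empty - which is exactly
# why you can exhaust inodes (f_ffree hits 0) with disk blocks to spare
</code></pre>
<p>The command line equivalent is <code>df -i</code></p>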
<p>Use AI to help yourself :D I did too :D</p>
]]></content:encoded></item><item><title><![CDATA[Kubernetes Preemption Event]]></title><description><![CDATA[You can look for Kubernetes Preemption Events in your observability system assuming you are exporting your Kubernetes Events to some store like some time series DB or similar
In our case, we use Prometheus and we have an exporter for exporting the Ku...]]></description><link>https://karuppiah.dev/kubernetes-preemption-event</link><guid isPermaLink="true">https://karuppiah.dev/kubernetes-preemption-event</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[#kubernetes #container ]]></category><category><![CDATA[events]]></category><category><![CDATA[event-driven-architecture]]></category><category><![CDATA[kubernetes architecture]]></category><category><![CDATA[kubernetes debugging]]></category><category><![CDATA[kubernetes-pods]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Wed, 24 Dec 2025 11:11:16 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/d9ILr-dbEdg/upload/59b591c1d59951abd7e0588f6cd18663.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You can look for Kubernetes Preemption Events in your observability system assuming you are exporting your Kubernetes Events to some store like some time series DB or similar</p>
<p>In our case, we use Prometheus and we have an exporter for exporting the Kubernetes Events</p>
<p>With this, we are able to find all the preemption events using this PromQL query</p>
<pre><code class="lang-plaintext">kube_event_exporter{reason="Preempted"}
</code></pre>
<p>If you look at this data over a long period - say 1 day, 1 week etc - you will see all the preemption events that have happened. You can explore this data in Prometheus or Grafana - and execute it as a Range query over a period of time, instead of an Instant query</p>
<p>This event data also has fields like <code>source</code> which tells you the scheduler’s name. For example, it could have a value like <code>/default-scheduler</code></p>
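<p>For example, to see which scheduler is doing the preempting, you can group the events by that field - a sketch based on our setup, since the exact metric and label names depend on your exporter’s configuration:</p>
<pre><code class="lang-plaintext">count by (source) (kube_event_exporter{reason="Preempted"})
</code></pre>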
<p>Preemption events help you understand which high priority pods are kicking out which low priority pods :)</p>
]]></content:encoded></item><item><title><![CDATA[Speedscope: Performance Data Visualization]]></title><description><![CDATA[Recently, when I discovered the py-spy profiler tool for Python, I also discovered Speedscope, which is a visualization tool for visualizing performance data (performance profiles etc). This is a flamegraph visualization. I have seen something of this sort w...]]></description><link>https://karuppiah.dev/speedscope-performance-data-visualization</link><guid isPermaLink="true">https://karuppiah.dev/speedscope-performance-data-visualization</guid><category><![CDATA[Speedscope]]></category><category><![CDATA[Flamegraph ]]></category><category><![CDATA[software development]]></category><category><![CDATA[Software Engineering]]></category><category><![CDATA[performance]]></category><category><![CDATA[Performance Optimization]]></category><category><![CDATA[software]]></category><category><![CDATA[Software]]></category><category><![CDATA[Software Testing]]></category><category><![CDATA[scale]]></category><category><![CDATA[scalability]]></category><category><![CDATA[debugging]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Devops articles]]></category><category><![CDATA[Developer]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Fri, 19 Dec 2025 19:21:56 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/-Vqn2WrfxTQ/upload/0dbea960649692ba74c69875919eca26.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Recently, when I discovered the py-spy profiler tool for Python, I also discovered Speedscope, which is a visualization tool for visualizing performance data (performance profiles etc). It’s a flamegraph visualization. I have seen something of this sort when using Golang tools (pprof)</p>
<p>Do check it out! :D</p>
<p><a target="_blank" href="https://speedscope.app">https://speedscope.app</a></p>
<p>It's open source! I Love Open Source Software projects :D Check out the source code at</p>
<p><a target="_blank" href="https://github.com/jlfwong/speedscope">https://github.com/jlfwong/speedscope</a></p>
<p>I'm yet to use it extensively. I did play around with it using some sample data on the speedscope website. I'll be trying it out with actual data from some application and see how it looks :D</p>
]]></content:encoded></item><item><title><![CDATA[Profilers!]]></title><description><![CDATA[So, today, we had an issue in one of our internal systems called API Tester. It was very slow. Only today it was slow, and the CPU usage was very high according to our monitoring systems, especially since today morning. Before noticing the CPU usage,...]]></description><link>https://karuppiah.dev/profilers</link><guid isPermaLink="true">https://karuppiah.dev/profilers</guid><category><![CDATA[profiler]]></category><category><![CDATA[gdb]]></category><category><![CDATA[profiling]]></category><category><![CDATA[profile]]></category><category><![CDATA[Python 3]]></category><category><![CDATA[Python]]></category><category><![CDATA[python beginner]]></category><category><![CDATA[debugging]]></category><category><![CDATA[debugging techniques]]></category><category><![CDATA[debugging tips]]></category><category><![CDATA[Developer]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Devops articles]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Thu, 18 Dec 2025 13:49:08 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/d9ILr-dbEdg/upload/0fb09b9ee29a0e2b8d10c8e62920deb9.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>So, today, we had an issue in one of our internal systems called API Tester. It was very slow. Only today it was slow, and the CPU usage was very high according to our monitoring systems, especially since today morning. Before noticing the CPU usage, we thought it was some DB issue - increased the DB size (CPU and RAM), but that was not the problem and our SQL queries were also running fast. But the API calls to the system were slow, though SQL queries used by those API calls were fast. Finally we realized something else is causing the slowness and that it’s also causing the CPU spike</p>
<p>This API Tester, it’s an important internal system, that’s a single entry point to a lot of things in our internal developer platform. For example, it takes care of / helps with running automation tests and showing the results / logs for them - through another system called Validator, and it also has an important feature - issuing access tokens that can be used to access all the internal systems that are protected by an auth wall. Apart from these, there are so many other features it has that I’m yet to learn about</p>
<p>This internal system is written in Python as a web application</p>
<p>I was blindly trying to debug this system to understand why CPU usage was high - using Google, Google’s AI answers and public forums (StackOverflow etc)</p>
<p>Based on my noob Googling, I went ahead and ran <code>gdb</code> blindly - installing <code>gdb</code> and many other companion things, like stuff specific to <code>python</code> for debugging <code>python</code> programs with <code>gdb</code>. This caused the Python program to halt (!!!!) :’) I thought I could do live debugging, but no, it halted the process (still gotta read about this) - as if there was a breakpoint - and this caused issues for all the users of the system. After multiple separate runs, I stopped <code>gdb</code> and didn’t run it again once I realized it was causing problems for users who were already complaining about slowness</p>
<p>Finally, I found <code>py-spy</code> which seemed like a pretty interesting and fancy tool to debug Python programs</p>
<p>You can find the source code of it here - <a target="_blank" href="https://github.com/benfred/py-spy">https://github.com/benfred/py-spy</a></p>
<p>It helped with understanding which functions are taking up a lot of CPU. There’s more to the tool than the basic stuff I used. I need to learn more - about <code>gdb</code> and <code>py-spy</code>. I have just tried to understand the ABCs of profiling python programs</p>
<p>The idea from my Googling was - the Linux system had high CPU usage - check that using <code>top</code>, find the specific python process that’s using too much CPU, get its process ID, check if there are many threads running under it and which thread is using too much CPU, and then check which python code invoked that thread, what it’s doing and why it’s slow</p>
<p><code>gdb</code> helped with finding threads inside the process and some more stuff which I didn’t understand. The <code>python3.9-dbg</code> (debug build of the Python 3.9 interpreter) and <code>libpython3.9-dbg</code> (debug build of the Python 3.9 shared library) Ubuntu packages also helped, with the <code>py-bt</code> and <code>py-bt-full</code> commands in <code>gdb</code></p>
<p>But we are still nowhere close to being able to debug this if it happens again. We do now have some data and some guesses on what would have caused this problem, and next time we’ll be in a better position to debug with <code>py-spy</code>. This time, we had run <code>py-spy</code> very close to the end of the problem, so we didn’t get much data except a few things. Later the issue was also resolved by doing some restarts</p>
<p>I’ll write more about this when the problem happens again. In the meantime, I’ll probably create some sample slow running programs and debug them using <code>gdb</code> and <code>py-spy</code>, to understand the differences, pros and cons etc of using different tools for profiling, to understand CPU usage, RAM usage etc</p>
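<p>As a starting point for that, here’s a sketch of a sample slow program, profiled with Python’s built-in <code>cProfile</code> (not <code>py-spy</code> - <code>py-spy</code> attaches to an already running process from the outside, while <code>cProfile</code> runs in-process):</p>
<pre><code class="lang-python">import cProfile
import io
import pstats

def slow_sum(n):
    # Deliberately CPU-heavy: sum of squares in a tight Python loop
    total = 0
    for i in range(n):
        total += i * i
    return total

def fast_sum(n):
    # Closed-form equivalent, for contrast in the profile
    return (n - 1) * n * (2 * n - 1) // 6

profiler = cProfile.Profile()
profiler.enable()
slow_sum(200_000)
fast_sum(200_000)
profiler.disable()

# slow_sum should dominate the cumulative time column
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
</code></pre>
<p>With <code>py-spy</code>, you’d instead point it at a live process ID - no code changes needed</p>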
<p>Till then, see ya! :)</p>
<p>Apparently there are many such interesting profilers. Like, a profiler for Ruby, for PHP etc</p>
<p>References:</p>
<ul>
<li><p><a target="_blank" href="https://news.ycombinator.com/item?id=24837485">https://news.ycombinator.com/item?id=24837485</a></p>
</li>
<li><p><a target="_blank" href="https://jvns.ca/blog/2018/09/08/an-awesome-new-python-profiler--py-spy-/">https://jvns.ca/blog/2018/09/08/an-awesome-new-python-profiler--py-spy-/</a></p>
</li>
<li><p><a target="_blank" href="https://news.ycombinator.com/item?id=24836833">https://news.ycombinator.com/item?id=24836833</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/rbspy">https://github.com/rbspy</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/adsr/phpspy/">https://github.com/adsr/phpspy/</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Kubernetes Features Enabled]]></title><description><![CDATA[If you have Prometheus running and scraping metrics - You can find Kubernetes list of features enabled information for every feature using kubernetes_feature_enabled metric which gives build information
kubernetes_feature_enabled{}

The name of the f...]]></description><link>https://karuppiah.dev/kubernetes-features-enabled</link><guid isPermaLink="true">https://karuppiah.dev/kubernetes-features-enabled</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[features]]></category><category><![CDATA[  feature flags]]></category><category><![CDATA[Feature Management]]></category><category><![CDATA[feature]]></category><category><![CDATA[Kubernetes Features]]></category><category><![CDATA[kubernetes api]]></category><category><![CDATA[kube-apiserver]]></category><category><![CDATA[Kubernetes API server]]></category><category><![CDATA[#prometheus]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Thu, 18 Dec 2025 07:14:25 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/G9gHtroxnaI/upload/237c58035707bb8a17bfd9b0c290aafb.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you have Prometheus running and scraping metrics - you can find out which Kubernetes features are enabled, for every feature, using the <code>kubernetes_feature_enabled</code> metric</p>
<pre><code class="lang-plaintext">kubernetes_feature_enabled{}
</code></pre>
<p>The name of the feature is mentioned in the <code>name</code> label</p>
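<p>For example, to list only the features that are currently enabled, you can filter on the metric value (a sketch - as far as I understand, the value is 1 for enabled and 0 for disabled, and there’s also a <code>stage</code> label for ALPHA/BETA etc):</p>
<pre><code class="lang-plaintext">kubernetes_feature_enabled == 1
</code></pre>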
]]></content:encoded></item><item><title><![CDATA[Almost Same Concept, Just Different Implementations]]></title><description><![CDATA[It’s always interesting to connect the dots. Let me give you some tech examples where the “concept” is the same but the “implementation” is different
For example, same programming language, but different runtimes and compilers for it. For example, Jav...]]></description><link>https://karuppiah.dev/almost-same-concept-just-different-implemenations</link><guid isPermaLink="true">https://karuppiah.dev/almost-same-concept-just-different-implemenations</guid><category><![CDATA[Implementation]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Tue, 02 Dec 2025 13:53:11 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/cckf4TsHAuw/upload/f9dbb42670841d8decc9d248b1765f69.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>It’s always interesting to connect the dots. Let me give you some tech examples where the “concept” is the same but the “implementation” is different</p>
<p>For example, same programming language, but different runtimes and compilers for it. For example, JavaScript on the backend has multiple runtimes now - there’s <a target="_blank" href="https://nodejs.org/">Node.js</a> , there’s <a target="_blank" href="https://deno.com/">Deno</a> , there’s <a target="_blank" href="https://bun.com/">Bun</a>. For JavaScript in the browser too there are multiple engines now - there’s <a target="_blank" href="https://spidermonkey.dev/">SpiderMonkey</a> (in Mozilla browsers), there’s <a target="_blank" href="https://v8.dev/">V8</a> (in Chrome/Chromium browsers), there’s JavaScriptCore (part of <a target="_blank" href="https://webkit.org/">WebKit</a>, in Safari browsers). Same / similar / mostly similar features, but different implementations. Mind you, these are very complex pieces of software we are talking about. No one has to rewrite them unless they simply want to, or they are just so interested in it, or maybe they have a very strong reason to and they think they can do better or do something differently - maybe for different use cases etc</p>
<p>Another example I would like to bring up is - <a target="_blank" href="https://microsoft.github.io/language-server-protocol/">LSP - Language Server Protocol</a> - where someone got the idea to standardize how language support gets integrated into editors and IDEs. They said - we will have one language server per language, which serves all the needs of language plugins across different editors and IDEs. The user runs the language server, and then any client that understands the Language Server Protocol can use the language server and provide language features in an editor or IDE. So, people just have to build basic Language Server Protocol client(s) for different editors and IDEs and that’s it - they don’t have to reimplement the logic of the different language server features in each plugin, for each language and editor/IDE combination - that would be a lot of implementations and also hard to maintain. Some examples of language server features - find the definition of a symbol, find references of a symbol etc, where a symbol can be a variable, function, method, type etc. So, they pulled the language features required by editors and IDEs out of process - out of the plugin’s process - and used the network and a common protocol, pushing the features into a server. This is similar to how HTTP is so ubiquitous - an HTTP server written in one language and HTTP clients written in so many languages. It’s useful when different languages are needed. For example, each editor and IDE will have its own way of doing things, its own way plugins are built, installed and used - maybe each editor and IDE defines its own programming language for plugins, unless they decide to get the plugin as a binary and define an API for interacting with it.
For example, Terraform Custom Providers, basically plugins that plug into Terraform, are all binaries and can be written in any language - so that’s a big win! Now, why do I bring up LSP? Well, similar to the LSP concept, there’s MCP - Model Context Protocol - but it’s in a different context and hence a different implementation, where there are MCP servers to mediate between AI models and tools/services, similar to how LSP and language servers mediate between an editor or IDE plugin and a language. This is so that not everyone has to write new custom code on both sides (the AI and the tool/service) to integrate them. MCP makes things easier and standardizes things :)</p>
<p>Another popular example of same concept, different implementations - package managers. There are so many package managers for JavaScript. The popular one was npm, then came yarn, then came more! Same for Golang - so many package managers / dependency management tools - there was glide and what not; now Golang has an official one, but yeah, one can still choose to write and create their own package manager if they want. Now, Python also has more than one package manager. It’s not just the official one, which is <code>pip</code> - now there’s <code>uv</code>, which is a different implementation but with the same features! Interesting, isn’t it? :D</p>
<p>And then there’s the API. HTTP APIs, or even custom TCP APIs, like the Redis API, the S3 API etc. And there are many different implementations. So many implementations of Redis servers, implementing the Redis features and also the Redis protocol. Same is true for S3 - there’s MinIO. Same is true for some software that wants to replace other popular software - either as a drop-in replacement, or by being as close and similar to it as possible. For example, there’s Kafka and Redpanda. There’s Redis and there’s Valkey, the fork. There’s Redis and Dragonfly</p>
<p>At this point, the conversation has generalized to - one need - but many providers for that need. But, you get the point. What’s the best thing about these? You can learn from all of them. You get to learn from all the different implementations - especially because in many cases the code is all open source, and even otherwise, you get to use them, try them out, read their documentation to understand what they do differently, what are they trying to do and why they made another thing that does the same thing</p>
<p>In a world that says “It’s already done”, “It’s an old idea” etc, you can still learn to create new things that do the same old thing, but a bit differently or even in the same way - just because you wanted to :)</p>
]]></content:encoded></item><item><title><![CDATA[.new Top Level Domain (TLD)]]></title><description><![CDATA[The .new TLD is a pretty recent addition among the many new TLDs we have now. The interesting thing about .new TLD is that it has some rules, like - any URL with .new TLD should lead to an action - a “new” action and a few more rules - detailed rules...]]></description><link>https://karuppiah.dev/new-top-level-domain-tld</link><guid isPermaLink="true">https://karuppiah.dev/new-top-level-domain-tld</guid><category><![CDATA[domain]]></category><category><![CDATA[dns]]></category><category><![CDATA[#TLD]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Sat, 29 Nov 2025 12:14:09 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/67l-QujB14w/upload/8f4f5138a2c9c69e38a4337be4d7c1cb.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The <code>.new</code> TLD is a pretty recent addition among the many new TLDs we have now. The interesting thing about <code>.new</code> TLD is that it has some rules, like - any URL with <code>.new</code> TLD should lead to an action - a “new” action and a few more <a target="_blank" href="https://get.new/#registration">rules</a> - <a target="_blank" href="https://www.registry.google/policies/registration/new/">detailed rules over here</a></p>
<p>You can find more details about the <code>.new</code> TLD online, like here - <a target="_blank" href="https://www.registry.google/tlds/new/">https://www.registry.google/tlds/new/</a></p>
<p>You can find out more about <code>.new</code> at these places too -</p>
<p><a target="_blank" href="https://whats.new/">https://whats.new/</a> - Website about what’s <code>.new</code> TLD</p>
<p><a target="_blank" href="https://whats.new/shortcuts/">https://whats.new/shortcuts/</a> - All the shortcuts out there using <code>.new</code></p>
<p>To get the <code>.new</code> TLD, checkout the domain registrars that you already know, or go from here -</p>
<p><a target="_blank" href="https://get.new/">https://get.new/</a></p>
<p>You can also read all the FAQs around <code>.new</code> TLD here - <a target="_blank" href="https://get.new/#faqs">https://get.new/#faqs</a></p>
<p>Do try out existing <code>.new</code> TLD domains! It’s fun!!</p>
<p>For example, just try docs.new ( <a target="_blank" href="https://docs.new">https://docs.new</a> or just <a target="_blank" href="http://docs.new">docs.new</a> ) for Google Docs</p>
<p>Try forms.new ( <a target="_blank" href="http://forms.new">http://forms.new</a> or just <a target="_blank" href="http://forms.new">forms.new</a> ) for Google Forms</p>
<p>You can also try singular version - doc.new ( <a target="_blank" href="https://doc.new">https://doc.new</a> or <a target="_blank" href="http://doc.new">doc.new</a> ) , form.new ( <a target="_blank" href="https://form.new">https://form.new</a> or <a target="_blank" href="http://form.new">form.new</a> )</p>
<p>And for Google Sheets and Google Slides -</p>
<p><a target="_blank" href="http://sheets.new">sheets.new</a> or <a target="_blank" href="http://sheet.new">sheet.new</a></p>
<p><a target="_blank" href="http://slides.new">slides.new</a> or <a target="_blank" href="http://slide.new">slide.new</a></p>
]]></content:encoded></item><item><title><![CDATA[Open Source: Part 2]]></title><description><![CDATA[How can you contribute to Open Source?
You can contribute so many things to Open Source! There’s Open Source Software, there’s Open Source Hardware (where hardware blueprint is open sourced), Open Source Art - yes!! Even Art and many other things are...]]></description><link>https://karuppiah.dev/open-source-part-2</link><guid isPermaLink="true">https://karuppiah.dev/open-source-part-2</guid><category><![CDATA[Open Source]]></category><category><![CDATA[open source]]></category><category><![CDATA[open source beginners guide]]></category><category><![CDATA[Open Source Community]]></category><category><![CDATA[OpenSource Journey]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Wed, 15 Oct 2025 06:49:17 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/npxXWgQ33ZQ/upload/1a729e0d22dea9671106860759e90d1c.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>How can you contribute to Open Source?</p>
<p>You can contribute so many things to Open Source! There’s Open Source Software, there’s Open Source Hardware (where hardware blueprint is open sourced), Open Source Art - yes!! Even Art and many other things are Open Sourced</p>
<p>Some things that you can Open Source as part of Software are -</p>
<ul>
<li><p>Software themselves - like Tools, Systems/Services, Frameworks, Libraries/Modules/Packages/Dependencies</p>
</li>
<li><p>Configuration. Examples - Configuration for a tool/system/service. Other Examples are</p>
<ul>
<li><p>Infrastructure as Code and Infrastructure as Configuration</p>
</li>
<li><p>Diagram as Code. Diagram as Configuration. For example <a target="_blank" href="https://excalidraw.com/">Excalidraw</a> files, <a target="_blank" href="https://mermaid.js.org">Mermaid</a> Files</p>
</li>
<li><p>Dashboard as Code. Dashboard as Configuration. For example Grafana Dashboards</p>
</li>
</ul>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Open Source: Part 1]]></title><description><![CDATA[Open Source refers to Source Code that’s out there in the Open, in the public, on the Internet, and accessible to everyone
Source Code here refers to any code - of any program, tool, system, library, framework or any software for that matter
You can ...]]></description><link>https://karuppiah.dev/open-source-part-1</link><guid isPermaLink="true">https://karuppiah.dev/open-source-part-1</guid><category><![CDATA[Open Source]]></category><category><![CDATA[open source]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Mon, 13 Oct 2025 08:01:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/Z8yWSsx8OWE/upload/f30c53b5ec636746abd3f810b909978b.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Open Source refers to Source Code that’s out there in the Open, in the public, on the Internet, and accessible to everyone</p>
<p>Source Code here refers to any code - of any program, tool, system, library, framework or any software for that matter</p>
<p>You can read more about Open Source online. Here are some keywords to look for and search for -</p>
<ul>
<li><p>FOSS - Free and Open Source Software</p>
</li>
<li><p>OSS - Open Source Software</p>
</li>
<li><p>OSI - Open Source Initiative</p>
</li>
<li><p>FSF - Free Software Foundation</p>
</li>
<li><p>FLOSS - Free/Libre and Open Source Software</p>
</li>
</ul>
<p>You can find source code “hosted” on many platforms / websites. Some popular ones are <a target="_blank" href="https://github.com/">GitHub</a>, <a target="_blank" href="https://about.gitlab.com/">GitLab</a> and <a target="_blank" href="https://bitbucket.com">BitBucket</a>, though there are a lot more out there!</p>
]]></content:encoded></item><item><title><![CDATA[Aggregation of Different Documentations]]></title><description><![CDATA[Something interesting that I found recently - an aggregation of different documentations
https://devdocs.io
It was built by Thibaut - https://github.com/Thibaut/devdocs . You can read about it in Thibaut Courouble’s Website over here - https://thibau...]]></description><link>https://karuppiah.dev/aggregation-of-different-documentations</link><guid isPermaLink="true">https://karuppiah.dev/aggregation-of-different-documentations</guid><category><![CDATA[Open Source]]></category><category><![CDATA[open source]]></category><category><![CDATA[documentation]]></category><category><![CDATA[documentation tool]]></category><category><![CDATA[aggregation]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Tue, 29 Jul 2025 11:57:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/0LaBRkmH4fM/upload/14e51d905c775f8b3af8ee18e42786d7.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Something interesting that I found recently - an aggregation of different documentations</p>
<p><a target="_blank" href="https://devdocs.io">https://devdocs.io</a></p>
<p>It was built by <a target="_blank" href="https://github.com/Thibaut">Thibaut</a> - <a target="_blank" href="https://github.com/Thibaut/devdocs">https://github.com/Thibaut/devdocs</a> . You can read about it in <a target="_blank" href="https://thibaut.me">Thibaut Courouble’s Website</a> over here - <a target="_blank" href="https://thibaut.me/projects/#devdocs">https://thibaut.me/projects/#devdocs</a></p>
<p>As of this writing - it’s operated by freeCodeCamp - <a target="_blank" href="https://github.com/freeCodeCamp/devdocs">https://github.com/freeCodeCamp/devdocs</a> . You can see how <a target="_blank" href="https://github.com/Thibaut/devdocs">https://github.com/Thibaut/devdocs</a> redirects to <a target="_blank" href="https://github.com/freeCodeCamp/devdocs">https://github.com/freeCodeCamp/devdocs</a></p>
<p>This is similar to hosting online <code>man</code> (manual) pages for multiple tools, commands etc out there, which is kind of an aggregation of manuals aka documentations</p>
<p>But yeah, I’m not sure how good they are at keeping the aggregated information up to date with the source of truth - more like, almost the source of truth - which is the official documentation. Sometimes even official documentation is wrong, so, which is the source of truth? The source code of the particular version of the software you are using :) Be it a tool, a system, a library, a frontend app, a backend server, a mobile app, a desktop app, an embedded app, anything. Source Code (correct version) is The Source Of Truth</p>
]]></content:encoded></item><item><title><![CDATA[Checking AWS Instance Info]]></title><description><![CDATA[https://ec2instances.info
OR https://instances.vantage.sh
GitHub Repository - https://github.com/vantage-sh/ec2instances.info]]></description><link>https://karuppiah.dev/checking-aws-instance-info</link><guid isPermaLink="true">https://karuppiah.dev/checking-aws-instance-info</guid><category><![CDATA[AWS]]></category><category><![CDATA[aws ec2]]></category><category><![CDATA[ec2]]></category><category><![CDATA[EC2 instance]]></category><category><![CDATA[ec2 instance types]]></category><category><![CDATA[AWS,EC2]]></category><category><![CDATA[EC2 Instances]]></category><category><![CDATA[Open Source]]></category><category><![CDATA[open source]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Tue, 29 Jul 2025 11:44:59 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/A9_IsUtjHm4/upload/16822da6e6f0667ed48148bceaae4ec5.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a target="_blank" href="https://ec2instances.info">https://ec2instances.info</a></p>
<p>OR <a target="_blank" href="https://instances.vantage.sh">https://instances.vantage.sh</a></p>
<p>GitHub Repository - <a target="_blank" href="https://github.com/vantage-sh/ec2instances.info">https://github.com/vantage-sh/ec2instances.info</a></p>
]]></content:encoded></item><item><title><![CDATA[Finding Out Nginx Instances Running In An Environment]]></title><description><![CDATA[Assuming the environment is a container based environment. Other kinds of environments - Virtual Machines, Physical Machines, etc  
Blind Basic Checks for checking if nginx is running or not and how many nginx are running:

Check if the names of all ...]]></description><link>https://karuppiah.dev/finding-out-nginx-instances-running-in-an-environment</link><guid isPermaLink="true">https://karuppiah.dev/finding-out-nginx-instances-running-in-an-environment</guid><category><![CDATA[nginx]]></category><category><![CDATA[nginx ingress]]></category><category><![CDATA[NGINX Ingress Controller]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[containers]]></category><category><![CDATA[container]]></category><category><![CDATA[container orchestration]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Mon, 28 Jul 2025 17:17:15 GMT</pubDate><content:encoded><![CDATA[<p>Assuming the environment is a container based environment. Other kinds of environments - Virtual Machines, Physical Machines, etc  </p>
<p>Blind Basic Checks for checking whether nginx is running and how many nginx instances are running:</p>
<ul>
<li><p>Check if the names of all the containers in all the pods have nginx in the name. Also check if the names of all the pods have nginx in the name</p>
<ul>
<li><p>Why? This is based on the idea that if nginx is running in a container / pod, its container or pod name probably has the name nginx in it</p>
</li>
<li><p>Problems with this approach</p>
<ul>
<li><p>It misses out containers where the container name may not mention nginx but maybe nginx is running in them. You never know</p>
</li>
<li><p>It misses out pods where the pod name may not mention nginx but maybe nginx is running in them. You never know</p>
</li>
<li><p>Maybe some containers may have nginx in the container name but may not be running nginx in them. You never know</p>
</li>
<li><p>Maybe some pods may have nginx in the pod name but may not be running nginx in them. You never know</p>
</li>
</ul>
</li>
</ul>
</li>
<li><p>Check if the image names of all the containers in all the pods have nginx in the name</p>
<ul>
<li><p>Why? This is based on the idea that if nginx is running in a container, its image name probably has the name nginx in it</p>
</li>
<li><p>Problems with this approach</p>
<ul>
<li><p>It misses out containers where the image name may not mention nginx but maybe nginx is running in them. You never know</p>
</li>
<li><p>Maybe some containers may have nginx in the image name but may not be running nginx in them. You never know</p>
</li>
</ul>
</li>
</ul>
</li>
<li><p>Check if the commands and/or entry points of all the container images (in the build and runtime specification of the container image - for example, the Dockerfile) of all containers in all the pods have nginx in the command name / entry point. You can also check the source code of the command and/or entry point - for example, if it’s a script, say a shell script (sh, bash etc), a Python script, JavaScript, a Perl script etc - and see if it runs nginx, by checking for nginx in the name of the command that’s being executed</p>
<ul>
<li><p>Why? This is based on the idea that if nginx is running in a container, its command name / entry point name probably has the name nginx in it</p>
</li>
<li><p>Problems with this approach</p>
<ul>
<li><p>It misses out containers where the command name / entry point may not mention nginx but maybe nginx is running in them. You never know</p>
</li>
<li><p>Maybe some containers may have nginx in the command name / entry point name but may not be running nginx in them. You never know</p>
</li>
</ul>
</li>
</ul>
</li>
<li><p>Check if the command names of all the containers (in the container specification) in all the pods have nginx in the name. This refers to the container’s <code>command</code> field (not the <code>args</code> field)</p>
<ul>
<li><p>Why? This is based on the idea that if nginx is running in a container, its command name probably has the name nginx in it</p>
</li>
<li><p>Problems with this approach</p>
<ul>
<li><p>It misses out containers where the command name may not mention nginx but maybe nginx is running in them. You never know</p>
</li>
<li><p>Maybe some containers may have nginx in the command name but may not be running nginx in them. You never know</p>
</li>
</ul>
</li>
</ul>
</li>
<li><p>Check if the nginx command works in all the containers in all the pods. Maybe run <code>nginx -v</code> to print the version</p>
<ul>
<li><p>Why? This is based on the idea that if nginx is running in a container, it has nginx command installed in it</p>
</li>
<li><p>Problems with this approach</p>
<ul>
<li><p>Maybe nginx is installed in the container (in the container image), but it’s not running</p>
</li>
<li><p>Maybe some other command (and not nginx server) is named as nginx</p>
</li>
<li><p>Maybe nginx command is renamed as something else which doesn’t refer to nginx in the name</p>
</li>
</ul>
</li>
</ul>
</li>
<li><p>Check if nginx process is running in all the containers in all the pods</p>
<ul>
<li><p>Why? This is based on the idea that if nginx is running in a container, then, it’s running in the container?</p>
</li>
<li><p>Problems with this approach</p>
<ul>
<li>It’s not easy to check if nginx is running in a container. For example, one thing that one could do is - check if an nginx process is running, by looking at the <code>ps aux</code> command output for any process running the nginx command. But then, a command can be named anything else too and still be running nginx. You never know. <code>ps aux</code> just shows the command name. Also, just because the command name in the <code>ps aux</code> output has nginx in it, or the name itself is nginx, doesn’t mean the nginx server is running in it</li>
</ul>
</li>
</ul>
</li>
<li><p>Hit all container ports and check if the port serves HTTP and/or HTTPS (assuming you are looking for an nginx serving web traffic). Then hit some non-existent URL(s) and check if the response body or response headers contain nginx</p>
<ul>
<li><p>Usually non-existent HTTP/HTTPS URL(s) are served by nginx with a 404 page mentioning that the response came from nginx - usually in the response body</p>
</li>
<li><p>nginx by default sends almost every response with the response header <code>Server: nginx/&lt;version&gt;</code>. For example, <code>Server: nginx/1.29.0</code></p>
</li>
<li><p>Problems with this approach</p>
<ul>
<li><p>It’s not necessary that our nginx is serving HTTP/HTTPS traffic - so this check’s assumptions may not hold. For example, nginx may be running as a TCP proxy for a protocol other than the web protocols (like HTTP, HTTPS) - say a PostgreSQL proxy, a MySQL proxy, or a proxy for any custom protocol on top of TCP - or it may be a TCP server for something else entirely, and this check would miss it</p>
</li>
<li><p>Maybe the nginx server is configured to not reveal its version (for example via the <code>server_tokens off;</code> directive) or anything else about itself - for security reasons. If one knows that we use nginx, and the specific version of nginx we use, then they can try to attack us using the known vulnerabilities of that version of nginx and cause harm to us</p>
</li>
</ul>
</li>
</ul>
</li>
<li><p>Maybe a mix of the above approaches can be used</p>
</li>
<li><p>Look at the base container images (recursively) and understand what’s installed in them and how - for example, the nginx command being installed through binaries, package managers, compiled source code etc. Even then, it’s not easy; nginx may be installed but not running, and just because a command or software is named nginx, it may not be the nginx web server that we think of</p>
</li>
</ul>
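<p>The name / image / command checks above can be scripted instead of being done by eye. Below is a rough Python sketch - it assumes you have saved the output of <code>kubectl get pods -A -o json</code> to a file, and it only surfaces <em>hints</em>, with all the caveats listed above still applying:</p>

```python
import json

def nginx_hints(pods_json):
    """Collect containers whose pod name, container name, image, command
    or args mention "nginx". This finds *hints* only, not proof that
    nginx is actually running (see the caveats above)."""
    hits = []
    for pod in pods_json.get("items", []):
        pod_name = pod["metadata"]["name"]
        namespace = pod["metadata"]["namespace"]
        for container in pod["spec"].get("containers", []):
            fields = {
                "pod": pod_name,
                "container": container.get("name", ""),
                "image": container.get("image", ""),
                "command": " ".join(container.get("command", []) + container.get("args", [])),
            }
            matched = [key for key, value in fields.items() if "nginx" in value.lower()]
            if matched:
                hits.append({"namespace": namespace, **fields, "matched_on": matched})
    return hits

# Usage sketch:
#   kubectl get pods -A -o json > pods.json
# then:
#   with open("pods.json") as f:
#       for hit in nginx_hints(json.load(f)):
#           print(hit)
```

<p>Note that this automates only the spec-level checks; the command probe and HTTP probe checks above still have to be run inside or against the containers themselves.</p>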
]]></content:encoded></item><item><title><![CDATA[Nginx Upgrade]]></title><description><![CDATA[A small and simple plan for doing an nginx upgrade
This is in the context of using nginx inside Kubernetes as a container. So, you will see references to recommend using ingress and ingress controller, or Gateways, instead of standalone nginx instanc...]]></description><link>https://karuppiah.dev/nginx-upgrade</link><guid isPermaLink="true">https://karuppiah.dev/nginx-upgrade</guid><category><![CDATA[nginx]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[nginx ingress]]></category><category><![CDATA[NGINX Ingress Controller]]></category><category><![CDATA[ingress]]></category><category><![CDATA[ingress resources]]></category><category><![CDATA[Ingress Controllers]]></category><category><![CDATA[IngressController]]></category><category><![CDATA[ingress-nginx]]></category><category><![CDATA[upgrade]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Sat, 26 Jul 2025 13:16:38 GMT</pubDate><content:encoded><![CDATA[<p>A small and simple plan for doing an nginx upgrade</p>
<p>This is in the context of using nginx inside Kubernetes as a container. So, you will see references to recommend using ingress and ingress controller, or Gateways, instead of standalone nginx instances.</p>
<p><a target="_blank" href="https://docs.google.com/document/d/1htTY7D2O1boloo1uNsIxMnZC667tHix1ZrbVp03jHpM">Google Docs copy of this post</a></p>
<p><strong>Scope</strong></p>
<ul>
<li><p>Only standalone nginx instances are part of this upgrade</p>
</li>
<li><p>Upgrade with no changes in features or functionality from the perspective of the clients of the nginx. We don’t promise to use any of the new version’s features, unless using them causes no changes from the clients’ perspective, requires only small changes to the existing configuration (Infrastructure as Code, nginx configuration etc), and is easy to test. Otherwise we will do a lift and shift of sorts, as much as possible: put the configuration file, with no or minimal changes, on the new version and deploy it. Anything else will be done separately and not as part of this upgrade</p>
</li>
</ul>
<p><strong>Success Criteria</strong></p>
<ul>
<li><p>The new version of nginx is rolled out without any service disruption - to the relevant services</p>
<ul>
<li><p>Check Metrics, Events, Logs to look for any service disruption</p>
<ul>
<li><p>Look at Resource Usage: CPU, RAM, Network (traffic - both inbound and outbound), Disk</p>
</li>
<li><p>Look at HTTP errors - 4xx and 5xx. Especially 5xx</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>
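<p>As a small aid for the HTTP error check above, here’s a rough Python sketch that counts status classes straight from nginx access log lines. It assumes the default <code>combined</code> log format, where the status code is the first field after the quoted request line; with a custom <code>log_format</code> the parsing would need to change:</p>

```python
from collections import Counter

def status_class_counts(log_lines):
    """Count HTTP status classes (2xx/3xx/4xx/5xx) from nginx access log
    lines in the default "combined" format, where the status code is the
    first field after the quoted request line."""
    counts = Counter()
    for line in log_lines:
        parts = line.split('"')
        if len(parts) < 3:
            continue  # not a combined-format line; skip it
        rest = parts[2].split()
        if rest and rest[0].isdigit():
            counts[rest[0][0] + "xx"] += 1
    return counts
```

<p>A spike in the <code>5xx</code> bucket after the rollout, compared to before, is a strong signal of service disruption - though proper metrics and dashboards are preferable to ad hoc log parsing.</p>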
<p><strong>Plan</strong></p>
<ul>
<li><p>Decide on the version of nginx to use. Preferably the latest version - which is 1.29 as of this document writing (July 24th 2025) and it was released on 25th June 2025 9:39 AM IST. References: <a target="_blank" href="https://github.com/nginx/nginx">[1]</a>, <a target="_blank" href="https://github.com/nginx/nginx/releases">[2]</a>, <a target="_blank" href="https://github.com/nginx/nginx/releases/tag/release-1.29.0">[3]</a></p>
</li>
<li><p>Try out a blind upgrade - without checking differences between the old version and the new version</p>
<ul>
<li><p>Simply upgrade the nginx version to the new version - in the Dockerfile and build a new image using CI and publish it (push it to container image registry) and use it in the dev environment and test it out</p>
</li>
<li><p>Configure the new nginx version with the same configuration that’s used in the old version. So that it will point to the correct set of applications and have correct and expected configuration (unless there are breaking changes in the new version)</p>
</li>
<li><p>Look for errors in the logs, metrics, events</p>
</li>
<li><p>Look for errors at the platform level in Kubernetes - Kubernetes related events etc</p>
</li>
<li><p>Check resource usage: CPU, RAM, Network, Disk. Use metrics and dashboards for this. Set up monitoring and observability, MELTS (Metrics, Events, Logs, Traces, Spans), and alerts beforehand - all of this before doing the upgrade</p>
</li>
<li><p>Try running application tests against the new nginx</p>
</li>
<li><p>Run load tests too, to understand performance, and do benchmarking. All of this is to compare performance between the old version and the new version. So, run the load tests and benchmarks for the old version too, preferably before doing the upgrade and before running them for the new version</p>
</li>
</ul>
</li>
<li><p>Understand what features of the nginx are being used by the clients (human users, product engineering teams, services etc). Confirm the understanding with the clients (teams etc)</p>
<ul>
<li><p>For example, nginx can be extended with plugins, written in the Lua programming language etc. Check if any nginx plugins are being used</p>
</li>
<li><p>nginx has features to extend nginx using nginx modules, written in C programming language. Check if any nginx modules are being used</p>
</li>
<li><p>Understand any customisations done to nginx, any custom features used in nginx etc</p>
</li>
<li><p>Understand the compatibility between</p>
<ul>
<li><p>Nginx and any nginx plugins (written in Lua etc)</p>
</li>
<li><p>Nginx and any nginx modules (written in C etc)</p>
</li>
</ul>
</li>
</ul>
</li>
<li><p>Check compatibility between the versions of the different pieces of software involved here</p>
<ul>
<li><p>Nginx version</p>
<ul>
<li><p>Versions of the Nginx plugins</p>
</li>
<li><p>Versions of the Nginx Modules</p>
</li>
</ul>
</li>
<li><p>Compatibility between nginx and the base image operating system version</p>
</li>
</ul>
</li>
<li><p>Check the changelog of all the versions from the old version to the new version</p>
<ul>
<li><p>Look for breaking changes</p>
<ul>
<li><p>Removal of features</p>
</li>
<li><p>Removal of configuration directives in the nginx configuration</p>
</li>
<li><p>Rename of configuration directives in the nginx configuration</p>
</li>
<li><p>Change in behaviour in the nginx in general</p>
</li>
<li><p>Change in behaviour for any of the nginx configuration directives</p>
</li>
</ul>
</li>
<li><p>Understand how the breaking changes affect us - based on our usage of the nginx, assuming we are not using all the features and configuration directives of nginx</p>
</li>
</ul>
</li>
<li><p>Ensure there’s a plan for</p>
<ul>
<li>Ability to downgrade, smoothly, without any issues</li>
</ul>
</li>
</ul>
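<p>One hedged sketch for a post-rollout sanity check: read the <code>Server</code> response header and compare it with the version you expect. This only works when nginx is configured to advertise its version (with <code>server_tokens off;</code> it sends just <code>Server: nginx</code>, as discussed in an earlier post):</p>

```python
import re
import urllib.request

def nginx_version_from_server_header(server_header):
    """Extract the version from a Server header value like "nginx/1.29.0".
    Returns None when it doesn't look like a versioned nginx banner
    (e.g. plain "nginx", or a different server entirely)."""
    match = re.fullmatch(r"nginx/(\d+(?:\.\d+)*)", server_header.strip())
    return match.group(1) if match else None

def rollout_serves_expected_version(url, expected_version):
    """Hit the given URL (assumed internal and trusted) and check
    whether the advertised nginx version matches the expected one."""
    with urllib.request.urlopen(url) as response:
        header = response.headers.get("Server", "")
    return nginx_version_from_server_header(header) == expected_version
```

<p>This is a quick sanity check, not proof of a healthy rollout - the success criteria above (metrics, events, logs) are still the real signal.</p>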
<p><strong>Future</strong></p>
<ul>
<li>Recommend product teams to use ingress and ingress controller. Or Gateway API and Gateways, which is the successor of ingress and ingress controller</li>
</ul>
<p>References:</p>
<p>[1]: <a target="_blank" href="https://github.com/nginx/nginx">https://github.com/nginx/nginx</a></p>
<p>[2]: <a target="_blank" href="https://github.com/nginx/nginx/releases">https://github.com/nginx/nginx/releases</a></p>
<p>[3]: <a target="_blank" href="https://github.com/nginx/nginx/releases/tag/release-1.29.0">https://github.com/nginx/nginx/releases/tag/release-1.29.0</a></p>
]]></content:encoded></item><item><title><![CDATA[Nginx Ingress Controller Upgrade]]></title><description><![CDATA[This is what I planned when I thought about this for my current company and team and our infrastructure. Funnily, I was told later that we had to upgrade only standalone nginx instances and not the nginx ingress controller. But I still kept this plan...]]></description><link>https://karuppiah.dev/nginx-ingress-controller-upgrade</link><guid isPermaLink="true">https://karuppiah.dev/nginx-ingress-controller-upgrade</guid><category><![CDATA[nginx]]></category><category><![CDATA[nginx ingress]]></category><category><![CDATA[upgrade]]></category><category><![CDATA[ingress]]></category><category><![CDATA[Ingress Controllers]]></category><category><![CDATA[IngressController]]></category><category><![CDATA[ingress resources]]></category><category><![CDATA[Kubernetes]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Sat, 26 Jul 2025 13:13:19 GMT</pubDate><content:encoded><![CDATA[<p>This is what I planned when I thought about this for my current company and team and our infrastructure. Funnily, I was told later that we had to upgrade only standalone nginx instances and not the nginx ingress controller. But I still kept this plan in case it comes in handy some day ;) :)</p>
<p><a target="_blank" href="https://docs.google.com/document/d/1aCHN9CJNJf1HWvrT-JSscNwQ1guxRxWd-2rgbUqPSno/edit?usp=sharing">Google Docs copy of this post</a></p>
<h1 id="heading-things-to-ensure">Things to ensure:</h1>
<ul>
<li><p>Understand if you want to upgrade the nginx too or not. Look at compatibility between the nginx ingress controller versions and the nginx versions</p>
</li>
<li><p>Anything and everything related to Nginx Upgrade in general, in the case of Standalone Nginx. Refer <a target="_blank" href="https://docs.google.com/document/u/0/d/1htTY7D2O1boloo1uNsIxMnZC667tHix1ZrbVp03jHpM">Nginx Upgrade</a> or the Nginx Upgrade post</p>
</li>
</ul>
<ul>
<li><p>Everything works as is and smoothly, with the latest version</p>
<ul>
<li>Metrics to understand that everything is working as is or better, with no degradation in performance and NO errors. At least NO new errors. For any existing / old errors present on the older version - let’s ensure we fix them, get rid of them, or knowingly ignore them. Preferably fix them or get rid of them (in case of any dead code related errors)</li>
</ul>
</li>
<li><p>Do Load Testing to check performance and do benchmarking. Do it on both the new version and the old version and check the difference in numbers (performance numbers) when it comes to performance. Ensure no issues like memory leaks etc or unnecessary huge increase in resource usage - CPU, RAM, Disk, and Network too</p>
</li>
<li><p>As little downtime as possible - preferably 0 downtime. Since nginx and the nginx ingress controllers are stateless services, 0 downtime should be possible. More like, MUST be possible - through a rolling update deployment strategy. Automation and observability help here</p>
<ul>
<li>Metrics to understand any possible downtime - from the client’s perspective. The client can be a human, or a service too. So, get metrics from the client perspective too (usage perspective) and NOT just server perspective</li>
</ul>
</li>
<li><p>Smooth and Easy upgrade. It should be easy to do the upgrade. Simple and easy. Automation helps here</p>
</li>
<li><p>Everything automated. Any and all steps automated</p>
</li>
<li><p>No gaps or space for human errors to occur. Testing ensures this and automation helps here</p>
</li>
<li><p>Thorough testing before doing upgrade</p>
<ul>
<li>In testing environments and staging (which is close to production environment)</li>
</ul>
</li>
<li><p>Understand the criticality of each of the nginx ingress controller instances</p>
<ul>
<li><p>This will help us to rollout the upgrade for less critical ones first and check / test for any issues / errors / problems / bugs</p>
</li>
<li><p>This will help us form a plan of - in which order we do the upgrade</p>
</li>
</ul>
</li>
<li><p>Smooth and Easy downgrade. It should be easy to downgrade. Simple and easy. Automation helps here</p>
</li>
<li><p>Have all kinds of data related to the existing nginx and nginx ingress controllers and the new nginx and the new ingress controller version we are going to upgrade to</p>
<ul>
<li><p>Find Versions of all the existing nginx ingress controllers</p>
<ul>
<li>This <strong><em>probably</em></strong> involves two versions - one is the nginx ingress controller itself and then the nginx. Since nginx is also available as a standalone software. Understand any nuances here and understand if it’s just one version - the version of the nginx ingress controller or if different versions of nginx ingress controller can be used with different versions of nginx (understand compatibility between nginx ingress controller and nginx)</li>
</ul>
</li>
<li><p>Find Version of the nginx ingress controller we are going to upgrade to</p>
<ul>
<li><p>This <strong><em>probably</em></strong> involves two versions - one is the nginx ingress controller itself and then the nginx. Since nginx is also available as a standalone software. Understand any nuances here and understand if it’s just one version - the version of the nginx ingress controller or if different versions of nginx ingress controller can be used with different versions of nginx. So, check this clearly (understand compatibility between nginx ingress controller and nginx)</p>
</li>
<li><p>Choose this carefully and meticulously. We need a new or latest version but also a stable version and something that has some amount of long term support</p>
</li>
<li><p>Ensure the stability of the new version by checking any data around it - issues raised, including security issues, performance issues, correctness issues.</p>
</li>
</ul>
</li>
<li><p>Check compatibility between the versions of the different pieces of software involved here</p>
<ul>
<li><p>Nginx Ingress Controller version</p>
</li>
<li><p>Kubernetes version (API version, control plane version etc)</p>
</li>
<li><p>Nginx version</p>
<ul>
<li><p>Versions of the Nginx plugins</p>
</li>
<li><p>Versions of the Nginx Modules</p>
</li>
</ul>
</li>
<li><p>Nginx Ingress Controller Image’s Base image Operating System version</p>
<ul>
<li><p>Compatibility between nginx and the base image operating system version</p>
</li>
<li><p>Compatibility between ingress controller and the base image operating system version</p>
</li>
</ul>
</li>
<li><p>Helm Chart version</p>
</li>
</ul>
</li>
<li><p>Difference between the older versions (the ones we have, the existing versions) and the new version we want to upgrade to</p>
<ul>
<li>This involves both the nginx ingress controller version and the nginx version</li>
</ul>
</li>
<li><p>Also, check the difference between the older versions (the ones we have, the existing versions) and the newer versions (a few recent ones if not all) apart from just the one we want to upgrade to, just to understand what kind of changes are happening - to be aware</p>
</li>
<li><p>Data around the nginx ingress controllers</p>
<ul>
<li><p>How many ingresses (ingress configurations) are there?</p>
</li>
<li><p>How many nginx ingress controllers are there?</p>
</li>
<li><p>Who uses the nginx ingress controllers?</p>
</li>
<li><p>Who manages the nginx ingress controllers?</p>
</li>
<li><p>How many ingresses are being managed by each of the nginx ingress controllers?</p>
</li>
<li><p>Are there any other ingress controllers other than nginx? Are they managing any ingresses?</p>
</li>
<li><p>Understand how the ingress is connected to the ingress controller using configuration</p>
<ul>
<li><p>This can be through annotations - which mentions ingress class name</p>
</li>
<li><p>This can be through ingress class name field in the spec</p>
</li>
<li><p>[EXTRA] Try to change the ingresses too if they use annotations to mention ingress class name and don’t have any ingress class name field in the spec. This is important for the future because depending on annotations to mention critical information is bad</p>
</li>
</ul>
</li>
<li><p>Check how many default ingress controllers are there</p>
<ul>
<li>[EXTRA] Ensure there’s only one default ingress controller. Or, there are no default ingress controllers. Maybe no default ingress controllers is a good idea - that way, users are required to put a particular ingress class name or else their ingress will never be used. But for this to happen, we need to fill up any missing fields in existing ingresses which map to the default since they don’t have any annotations to mention the ingress class name, or any ingress class name spec field</li>
</ul>
</li>
<li><p>Understand what features of the nginx ingress controller we use - this refers to both the ingress controller and the nginx</p>
<ul>
<li><p>For example, nginx can be extended with plugins, written in the Lua programming language etc. Check if any nginx plugins are being used</p>
</li>
<li><p>nginx has features to extend nginx using nginx modules, written in C programming language. Check if any nginx modules are being used</p>
</li>
<li><p>Understand any customisations done to nginx, any custom features used in nginx etc</p>
</li>
</ul>
</li>
<li><p>Understand what features of the ingress resource we use</p>
</li>
<li><p>Understand the ingress resource version we use. Version as in the API Group and its Version, that is, the <code>apiVersion</code> field in Kubernetes resources. For example, the <a target="_blank" href="http://networking.k8s.io">networking.k8s.io</a> API Group and its <code>v1</code> Version, under which we have the Ingress resource, that is, the <code>kind</code> field in Kubernetes resources</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>
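<p>Some of the inventory questions above (how many ingresses there are, which class each one uses) can be answered from <code>kubectl get ingress -A -o json</code>. A rough sketch - it checks both the <code>spec.ingressClassName</code> field and the older <code>kubernetes.io/ingress.class</code> annotation mentioned above:</p>

```python
from collections import Counter

LEGACY_ANNOTATION = "kubernetes.io/ingress.class"

def ingresses_per_class(ingress_list):
    """Count ingresses per ingress class, preferring spec.ingressClassName
    and falling back to the legacy annotation. Ingresses with neither are
    bucketed under "(default/unset)" - those rely on a default class."""
    counts = Counter()
    for ing in ingress_list.get("items", []):
        spec_class = ing.get("spec", {}).get("ingressClassName")
        annotation_class = ing.get("metadata", {}).get("annotations", {}).get(LEGACY_ANNOTATION)
        counts[spec_class or annotation_class or "(default/unset)"] += 1
    return counts
```

<p>A non-empty <code>(default/unset)</code> bucket is exactly the set of ingresses that would break if the default ingress controller were removed, which is why filling in the missing class names first (as suggested above) matters.</p>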
<ul>
<li><p>Metrics around the nginx ingress controllers</p>
<ul>
<li><p>How much traffic is currently coming</p>
<ul>
<li>Requests per second</li>
</ul>
</li>
<li><p>Network usage of the nginx - inbound and outbound traffic</p>
</li>
<li><p>Resource Usage apart from Network Usage - CPU and RAM. And any Disk Usage</p>
<ul>
<li>Ensure that there’s NO disk usage (internal/container/ephemeral/volatile or non-volatile/external-disk etc) proportional to the traffic - as much as possible. Ideally the disk usage should stay almost constant, given it’s a stateless service. For logs etc, it should log to standard output and that should be enough. No log files etc</li>
</ul>
</li>
</ul>
</li>
</ul>
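<p>For the default ingress controller check above: the default is marked on the IngressClass object through the <code>ingressclass.kubernetes.io/is-default-class</code> annotation. A sketch, assuming the output of <code>kubectl get ingressclass -o json</code>:</p>

```python
DEFAULT_ANNOTATION = "ingressclass.kubernetes.io/is-default-class"

def default_ingress_classes(ingress_class_list):
    """Return the names of IngressClass objects marked as the default.
    More than one result means the cluster has an ambiguous default."""
    return [
        ic["metadata"]["name"]
        for ic in ingress_class_list.get("items", [])
        if ic.get("metadata", {}).get("annotations", {}).get(DEFAULT_ANNOTATION) == "true"
    ]
```

<p>If this returns more than one name, the cluster’s default is ambiguous; if it returns none, every ingress must name its class explicitly - which is the stricter setup suggested above.</p>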
<p><strong>Modern Trends</strong></p>
<ul>
<li><p>Understand the modern trends. For the future</p>
<ul>
<li><p>For example, people are moving away from Ingress resource and Ingress Controllers to Gateway API</p>
<ul>
<li>[EXTRA] We can see if we need Gateway API or if we want it, and accordingly see if we can / want to migrate the Ingress resources to Gateway API resources</li>
</ul>
</li>
</ul>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Debugging Features and Extensions In VSCode: With Cursor editor and MCP Tools debugging as Example]]></title><description><![CDATA[Note: Before following this blog - Please do check the version of your VS Code or Cursor to check whether you have a newer version or older version or the exact version compared to the version of the VS Code or Cursor that I’m using

VS Code Version ...]]></description><link>https://karuppiah.dev/debugging-features-and-extensions-in-vscode-with-cursor-editor-and-mcp-tools-debugging-as-example</link><guid isPermaLink="true">https://karuppiah.dev/debugging-features-and-extensions-in-vscode-with-cursor-editor-and-mcp-tools-debugging-as-example</guid><category><![CDATA[mcp]]></category><category><![CDATA[mcp server]]></category><category><![CDATA[MCP Client]]></category><category><![CDATA[Model Context Protocol]]></category><category><![CDATA[Model Context Protocol (MCP)]]></category><category><![CDATA[AI]]></category><category><![CDATA[#ai-tools]]></category><category><![CDATA[cursor]]></category><category><![CDATA[cursor IDE]]></category><category><![CDATA[cursor ai]]></category><category><![CDATA[debugging]]></category><category><![CDATA[debugging techniques]]></category><category><![CDATA[debugging tips]]></category><category><![CDATA[debug]]></category><category><![CDATA[Visual Studio Code]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Fri, 27 Jun 2025 12:30:13 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/d9ILr-dbEdg/upload/2f5907289fc04295bbbfd38b082e55a3.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>Note: Before following this blog - Please do check the version of your VS Code or Cursor to check whether you have a newer version or older version or the exact version compared to the version of the VS Code or Cursor that I’m using</p>
</blockquote>
<p>VS Code Version that I’m using while writing this blog -</p>
<pre><code class="lang-plaintext">Version: 1.101.2
Commit: 2901c5ac6db8a986a5666c3af51ff804d05af0d4
Date: 2025-06-24T20:27:15.391Z
Electron: 35.5.1
ElectronBuildId: 11727614
Chromium: 134.0.6998.205
Node.js: 22.15.1
V8: 13.4.114.21-electron.0
OS: Darwin arm64 24.5.0
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751019232999/19e0ca6d-5c14-47b0-8b74-93cc6f2f52b9.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751019207237/5d2fe1cb-481b-4316-8003-ad27d3520878.png" alt class="image--center mx-auto" /></p>
<p>This is a generic method but it works for specific things too. For example, I used this method in <a target="_blank" href="https://www.cursor.com/">Cursor</a> editor to debug MCP issues</p>
<blockquote>
<p>MCP - Model Context Protocol - <a target="_blank" href="https://modelcontextprotocol.io/">https://modelcontextprotocol.io/</a></p>
</blockquote>
<p>This uses the <code>Output</code> feature of VSCode, which Cursor also has, since Cursor seems to be built on top of VSCode</p>
<p>You can get to the <code>Output</code> feature in multiple ways - keybindings / keyboard shortcuts (depending on your machine and VSCode setup), the Menu Bar, or the Command Palette</p>
<p>On macOS, I can do this by using the <code>Help</code> menu and searching for <code>Output</code> in the Search Bar 🔎🔍, which shows me that <code>Output</code> is there under the <code>View</code> menu. Using the Command Palette, I can choose the <code>View: Toggle Output</code> feature</p>
<blockquote>
<p>‼️ ⚠️ ⛔️ ☣️ ☢️ ⚠ Be careful to not choose <code>View: Clear Output</code> while the Output panel is open, or else it will clear the output logs</p>
</blockquote>
<p>An example is given below</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751017638866/c5c3c08d-faa4-4795-8812-f2ef5790d5f4.png" alt class="image--center mx-auto" /></p>
<p>The default view will look like this -</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751017672593/f77b2177-a50f-4a15-95ef-d243937dead9.png" alt class="image--center mx-auto" /></p>
<blockquote>
<p>‼️ ⚠️ ⛔️ ☣️ ☢️ ⚠ Be careful to not choose <code>View: Clear Output</code> while the Output panel is open, or else it will clear the output logs</p>
</blockquote>
<p>In the <code>Output</code> view, you can see the output (logs) of different things - I <strong>guess</strong> it’s usually features of VS Code or extensions of VS Code, though not all features and extensions show up here from what I can see. I <strong>guess</strong> a feature or extension can be missing due to - either</p>
<ul>
<li><p>The feature or extension not having output (logs)</p>
</li>
<li><p>The feature or extension not working</p>
</li>
<li><p>The feature or extension’s output (logs) alone not working</p>
</li>
<li><p>The feature or extension not being enabled</p>
</li>
</ul>
<p>Now, let’s look at the output (logs) of something, like <code>Git</code></p>
<p>I <strong>guess,</strong> by default, you will see the output of <code>Tasks</code> if you haven’t chosen anything previously</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018110727/1e4b45ee-c4eb-4f81-af49-aeba5605a556.png" alt class="image--center mx-auto" /></p>
<p>You can see a drop down menu in the <code>Output</code> view near the filter input box, which says <code>Tasks</code> by default I <strong>guess</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018281614/24a1d924-6cf1-4cea-86a9-9bc4d61c4a22.png" alt class="image--center mx-auto" /></p>
<blockquote>
<p>‼️ ⚠️ ⛔️ ☣️ ☢️ ⚠ Be careful to not click the button shown below while the Output panel is open, or else it will clear the output logs. Unless you want to clear logs, don’t click it</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018219620/62f94de3-f39e-4075-b89d-961e49922c4c.png" alt class="image--center mx-auto" /></p>
</blockquote>
<p>You can choose from the list of things in the drop down</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018369031/ce895918-2f1a-463c-aba9-8f1b4d97177b.png" alt class="image--center mx-auto" /></p>
<p>Here’s a zoomed-in view as an example</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018386637/e9135a74-a2ac-4126-baeb-6d55178be182.png" alt class="image--center mx-auto" /></p>
<p>You can usually see Language Server logs for different languages, depending on whether the Language Extension has been installed and enabled, whether it uses a Language Server, and whether it exposes the logs of that Language Server</p>
<blockquote>
<p>ℹ️ℹ A Language Server is something that implements the Language Server Protocol - <a target="_blank" href="https://en.wikipedia.org/wiki/Language_Server_Protocol">https://en.wikipedia.org/wiki/Language_Server_Protocol</a></p>
</blockquote>
<p>Let’s click on <code>Git</code> to see some sample output and also check out the output for other things</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018572455/8e783081-d3f0-4d8f-9553-ae476c2a5764.png" alt class="image--center mx-auto" /></p>
<p>Other examples:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018550992/3ca15709-653d-4d80-b7cc-862b0a8c5e41.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018636723/c5879b26-acd8-453a-aa5f-1007bd0b3b5f.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018647835/a78ea29e-a419-4840-b3d9-6a3e5677bab3.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018660118/740a03cc-404f-469a-a23a-477474b6d6e3.png" alt class="image--center mx-auto" /></p>
<p>Now, for debugging why the MCP client (in my Cursor editor) didn’t work, I used the <code>Output</code> tab and also my local terminal</p>
<p>So, this is what my Cursor editor settings look like</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018839067/9dc218a5-d1b1-450d-b538-a635be009602.png" alt class="image--center mx-auto" /></p>
<p>Notice that the chosen thing in the drop down is <code>MCP Logs</code></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018903492/2e2a08e9-a75c-4b69-a8f9-a7c4a6e35058.png" alt class="image--center mx-auto" /></p>
<p>Other things on the list are -</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018921391/12dcec20-40c6-4f8e-8ce1-de7ee2c069d4.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751018935587/3d684016-b97e-42b5-96b2-7138376a8a5f.png" alt class="image--center mx-auto" /></p>
<p>Let me show you what my <code>mcp.json</code> looks like. You can see the <code>mcp.json</code> file when you click the <code>New MCP Server</code> button under <code>MCP Tools</code> under <code>Tools &amp; Integrations</code></p>
<p>Before showing more details, I just wanna share my Cursor editor version so that you can double-check it against yours before reading further and following along in your own Cursor editor</p>
<p>My Cursor editor version is -</p>
<pre><code class="lang-plaintext">Version: 1.1.6
VSCode Version: 1.96.2
Commit: 5b19bac7a947f54e4caa3eb7e4c5fbf832389850
Date: 2025-06-25T02:14:24.784Z
Electron: 34.5.1
Chromium: 132.0.6834.210
Node.js: 20.19.0
V8: 13.2.152.41-electron.0
OS: Darwin arm64 24.5.0
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751019030768/5ac613bb-7198-4114-a6e9-311082778574.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751019359945/957c1fde-ca79-430d-b43d-9b3c4d273721.png" alt class="image--center mx-auto" /></p>
<p>Now, let’s go back to my <code>mcp.json</code></p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"mcpServers"</span>: {
    <span class="hljs-attr">"prometheus_stage"</span>: {
      <span class="hljs-attr">"url"</span>: <span class="hljs-string">"https://some-staging-server.com/prometheus/mcp"</span>,
      <span class="hljs-attr">"headers"</span>: {
        <span class="hljs-attr">"Content-Type"</span>: <span class="hljs-string">"application/json"</span>,
        <span class="hljs-attr">"accessToken"</span>: <span class="hljs-string">"put-a-valid-token-over-here"</span>
      }
    },
    <span class="hljs-attr">"prometheus_nightly"</span>: {
      <span class="hljs-attr">"url"</span>: <span class="hljs-string">"https://some-nightly-server.com/prometheus/mcp"</span>,
      <span class="hljs-attr">"headers"</span>: {
        <span class="hljs-attr">"Content-Type"</span>: <span class="hljs-string">"application/json"</span>,
        <span class="hljs-attr">"accessToken"</span>: <span class="hljs-string">"put-a-valid-token-over-here"</span>
      }
    },
    <span class="hljs-attr">"prometheus_dev"</span>: {
      <span class="hljs-attr">"url"</span>: <span class="hljs-string">"https://some-dev-server.com/prometheus/mcp"</span>,
      <span class="hljs-attr">"headers"</span>: {
        <span class="hljs-attr">"Content-Type"</span>: <span class="hljs-string">"application/json"</span>,
        <span class="hljs-attr">"accessToken"</span>: <span class="hljs-string">"put-a-valid-token-over-here"</span>
      }
    }
  }
}
</code></pre>
<blockquote>
<p>Note that <code>some-staging-server.com</code> , <code>some-nightly-server.com</code> , <code>some-dev-server.com</code> are all example names (DNS / Domain names). I changed the real DNS names in the settings to remove sensitive, confidential and private information. If you literally use any of them (the DNS names) in your MCP tools settings, you will mostly not get the same logs, because they don’t point to anything - as in - there are no DNS records for them.</p>
<p>For example, if you literally use <code>some-staging-server.com</code> in your MCP tools settings, the name (DNS / Domain name) will not resolve to anything unless you point it (using DNS records) to something - on your machine, in the DNS server you use, or in the DNS of your network. So, you will get a different error if you literally use <code>some-staging-server.com</code> in your MCP tools settings</p>
<p>You may get the same logs if you define those names yourself - that is, create DNS records for the domain names.</p>
<p>As of this writing, <code>some-staging-server.com</code> , <code>some-nightly-server.com</code> , <code>some-dev-server.com</code> are not actual domain names that resolve to anything, since they have no DNS records at this point in time</p>
</blockquote>
<p>In my case, this <code>mcp.json</code> file is present in my home directory under the <code>.cursor</code> directory. You can check this out here -</p>
<blockquote>
<p>Note that I blurred the DNS name in the settings to remove sensitive, confidential and private information</p>
</blockquote>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751019649764/cea95fa5-d7d9-4c39-9f06-516bd484ace3.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751019684142/89bfb98c-d10f-48c0-ac9a-2752bf6c4868.png" alt class="image--center mx-auto" /></p>
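<p>Once you know where the file is, one quick sanity check I’d suggest (a general tip, not something from my original debugging session) is to confirm that the file parses as valid JSON at all, for example with <code>python3 -m json.tool</code>. Here’s a sketch that validates an inline sample with the same placeholder values - for the real check, point it at <code>~/.cursor/mcp.json</code>:</p>

```shell
# Validate MCP config syntax: python3 -m json.tool exits non-zero on a
# JSON syntax error. This writes a sample config (placeholder values)
# to /tmp and checks it; for the real file, use ~/.cursor/mcp.json instead.
cat <<'EOF' > /tmp/mcp-sample.json
{
  "mcpServers": {
    "prometheus_stage": {
      "url": "https://some-staging-server.com/prometheus/mcp",
      "headers": {
        "Content-Type": "application/json",
        "accessToken": "put-a-valid-token-over-here"
      }
    }
  }
}
EOF
python3 -m json.tool /tmp/mcp-sample.json > /dev/null && echo "valid JSON"
```

<p>A trailing comma or a missing brace in <code>mcp.json</code> fails here immediately, which can rule out one whole class of problems before you even open the Output panel</p>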
<p>Now, I know there’s a problem with my MCP client since I don’t see a “green” mark in the <code>MCP Tools</code> section for the MCP tools I configured. It shows either a “yellow” mark - which seems to mean Loading - or a “red” mark - which seems to mean Failed / Error</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751019716500/606d53a9-fc04-45e8-9c0d-b5b0ea09d5d3.png" alt class="image--center mx-auto" /></p>
<p>Now, to debug this, I looked at the logs -</p>
<pre><code class="lang-plaintext">2025-06-27 12:11:47.509 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 12:11:47.509 [error] user-prometheus_stage: No server info found
2025-06-27 12:11:47.509 [info] user-prometheus_nightly: Handling CreateClient action
2025-06-27 12:11:47.509 [info] user-prometheus_nightly: Creating streamableHttp transport
2025-06-27 12:11:47.509 [info] user-prometheus_nightly: Connecting to streamableHttp server
2025-06-27 12:11:47.509 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 12:11:47.509 [error] user-prometheus_nightly: No server info found
2025-06-27 12:11:47.509 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 12:11:47.509 [error] user-prometheus_dev: No server info found
2025-06-27 12:11:52.074 [error] user-prometheus_nightly: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 12:11:52.075 [info] user-prometheus_nightly: Client closed for command
2025-06-27 12:11:52.076 [error] user-prometheus_nightly: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 12:11:52.076 [error] user-prometheus_nightly: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 12:11:52.076 [info] user-prometheus_nightly: Connecting to SSE server
2025-06-27 12:11:55.780 [error] user-prometheus_nightly: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 12:11:55.781 [error] user-prometheus_nightly: Error connecting to SSE server after fallback: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 12:11:55.781 [info] user-prometheus_nightly: Client closed for command
2025-06-27 12:11:55.783 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 12:11:55.783 [error] user-prometheus_stage: No server info found
2025-06-27 12:11:55.783 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 12:11:55.783 [error] user-prometheus_stage: No server info found
2025-06-27 12:11:55.796 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 12:11:55.796 [error] user-prometheus_nightly: No server info found
2025-06-27 12:11:55.797 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 12:11:55.797 [error] user-prometheus_nightly: No server info found
2025-06-27 12:11:55.800 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 12:11:55.800 [error] user-prometheus_dev: No server info found
2025-06-27 12:11:55.801 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 12:11:55.801 [error] user-prometheus_dev: No server info found
2025-06-27 12:13:51.869 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 12:13:51.870 [error] user-prometheus_stage: No server info found
2025-06-27 12:13:51.870 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 12:13:51.870 [error] user-prometheus_nightly: No server info found
2025-06-27 12:13:51.871 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 12:13:51.871 [error] user-prometheus_dev: No server info found
2025-06-27 12:13:53.813 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 12:13:53.813 [error] user-prometheus_stage: No server info found
2025-06-27 12:13:53.813 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 12:13:53.813 [error] user-prometheus_nightly: No server info found
2025-06-27 12:13:53.817 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 12:13:53.817 [error] user-prometheus_dev: No server info found
2025-06-27 15:16:11.749 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 15:16:11.749 [error] user-prometheus_stage: No server info found
2025-06-27 15:16:11.751 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 15:16:11.751 [error] user-prometheus_nightly: No server info found
2025-06-27 15:16:11.754 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 15:16:11.754 [error] user-prometheus_dev: No server info found
2025-06-27 15:36:09.516 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 15:36:09.516 [error] user-prometheus_stage: No server info found
2025-06-27 15:36:09.516 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 15:36:09.516 [error] user-prometheus_nightly: No server info found
2025-06-27 15:36:09.517 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 15:36:09.517 [error] user-prometheus_dev: No server info found
2025-06-27 15:36:28.347 [info] user-prometheus_nightly: Handling CreateClient action
2025-06-27 15:36:28.347 [info] user-prometheus_nightly: Creating streamableHttp transport
2025-06-27 15:36:28.347 [info] user-prometheus_nightly: Connecting to streamableHttp server
2025-06-27 15:36:28.359 [info] user-prometheus_nightly: Handling CreateClient action
2025-06-27 15:36:28.359 [info] user-prometheus_nightly: Creating streamableHttp transport
2025-06-27 15:36:28.359 [info] user-prometheus_nightly: Connecting to streamableHttp server
2025-06-27 15:36:32.369 [error] user-prometheus_nightly: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:32.371 [info] user-prometheus_nightly: Client closed for command
2025-06-27 15:36:32.373 [error] user-prometheus_nightly: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:32.373 [error] user-prometheus_nightly: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:32.373 [info] user-prometheus_nightly: Connecting to SSE server
2025-06-27 15:36:32.937 [error] user-prometheus_nightly: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:32.938 [info] user-prometheus_nightly: Client closed for command
2025-06-27 15:36:32.938 [error] user-prometheus_nightly: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:32.938 [error] user-prometheus_nightly: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:32.938 [info] user-prometheus_nightly: Connecting to SSE server
2025-06-27 15:36:35.205 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 15:36:35.205 [error] user-prometheus_stage: No server info found
2025-06-27 15:36:35.207 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 15:36:35.207 [error] user-prometheus_nightly: No server info found
2025-06-27 15:36:35.209 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 15:36:35.209 [error] user-prometheus_dev: No server info found
2025-06-27 15:36:36.369 [error] user-prometheus_nightly: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:36.369 [error] user-prometheus_nightly: Error connecting to SSE server after fallback: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:36.369 [info] user-prometheus_nightly: Client closed for command
2025-06-27 15:36:36.374 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 15:36:36.374 [error] user-prometheus_stage: No server info found
2025-06-27 15:36:36.374 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 15:36:36.374 [error] user-prometheus_stage: No server info found
2025-06-27 15:36:36.374 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 15:36:36.375 [error] user-prometheus_nightly: No server info found
2025-06-27 15:36:36.375 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 15:36:36.375 [error] user-prometheus_nightly: No server info found
2025-06-27 15:36:36.375 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 15:36:36.375 [error] user-prometheus_dev: No server info found
2025-06-27 15:36:36.377 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 15:36:36.377 [error] user-prometheus_dev: No server info found
2025-06-27 15:36:36.919 [error] user-prometheus_nightly: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:36.919 [error] user-prometheus_nightly: Error connecting to SSE server after fallback: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:36.919 [info] user-prometheus_nightly: Client closed for command
2025-06-27 15:36:36.921 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 15:36:36.921 [error] user-prometheus_stage: No server info found
2025-06-27 15:36:36.922 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 15:36:36.922 [error] user-prometheus_stage: No server info found
2025-06-27 15:36:36.922 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 15:36:36.922 [error] user-prometheus_nightly: No server info found
2025-06-27 15:36:36.923 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 15:36:36.923 [error] user-prometheus_nightly: No server info found
2025-06-27 15:36:36.925 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 15:36:36.926 [error] user-prometheus_dev: No server info found
2025-06-27 15:36:36.926 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 15:36:36.926 [error] user-prometheus_dev: No server info found
2025-06-27 15:36:45.569 [info] user-prometheus_dev: Handling CreateClient action
2025-06-27 15:36:45.569 [info] user-prometheus_dev: Creating streamableHttp transport
2025-06-27 15:36:45.569 [info] user-prometheus_dev: Connecting to streamableHttp server
2025-06-27 15:36:46.247 [error] user-prometheus_dev: Client error for command fetch failed
2025-06-27 15:36:46.247 [info] user-prometheus_dev: Client closed for command
2025-06-27 15:36:46.248 [error] user-prometheus_dev: Error connecting to streamableHttp server, falling back to SSE: fetch failed
2025-06-27 15:36:46.248 [error] user-prometheus_dev: Error connecting to streamableHttp server, falling back to SSE: fetch failed
2025-06-27 15:36:46.248 [info] user-prometheus_dev: Connecting to SSE server
2025-06-27 15:36:46.264 [error] user-prometheus_dev: Client error for command SSE error: TypeError: fetch failed: getaddrinfo ENOTFOUND some-dev-server.com
2025-06-27 15:36:46.264 [error] user-prometheus_dev: Error connecting to SSE server after fallback: SSE error: TypeError: fetch failed: getaddrinfo ENOTFOUND some-dev-server.com
2025-06-27 15:36:46.264 [info] user-prometheus_dev: Client closed for command
2025-06-27 15:36:46.267 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 15:36:46.268 [error] user-prometheus_stage: No server info found
2025-06-27 15:36:46.268 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 15:36:46.268 [error] user-prometheus_nightly: No server info found
2025-06-27 15:36:46.268 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 15:36:46.268 [error] user-prometheus_dev: No server info found
2025-06-27 15:36:46.932 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 15:36:46.932 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 15:36:46.932 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 15:36:50.719 [error] user-prometheus_stage: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:50.720 [info] user-prometheus_stage: Client closed for command
2025-06-27 15:36:50.720 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:50.720 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:50.720 [info] user-prometheus_stage: Connecting to SSE server
2025-06-27 15:36:54.470 [error] user-prometheus_stage: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:54.471 [error] user-prometheus_stage: Error connecting to SSE server after fallback: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 15:36:54.471 [info] user-prometheus_stage: Client closed for command
2025-06-27 15:36:54.476 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 15:36:54.477 [error] user-prometheus_stage: No server info found
2025-06-27 15:36:54.477 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 15:36:54.477 [error] user-prometheus_nightly: No server info found
2025-06-27 15:36:54.480 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 15:36:54.481 [error] user-prometheus_dev: No server info found
2025-06-27 15:39:18.898 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 15:39:18.898 [error] user-prometheus_stage: No server info found
2025-06-27 15:39:18.899 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 15:39:18.899 [error] user-prometheus_nightly: No server info found
2025-06-27 15:39:18.902 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 15:39:18.902 [error] user-prometheus_dev: No server info found
2025-06-27 15:39:24.126 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 15:39:24.127 [error] user-prometheus_stage: No server info found
2025-06-27 15:39:24.127 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 15:39:24.127 [error] user-prometheus_nightly: No server info found
2025-06-27 15:39:24.129 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 15:39:24.129 [error] user-prometheus_dev: No server info found
2025-06-27 15:47:49.588 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 15:47:49.588 [error] user-prometheus_stage: No server info found
2025-06-27 15:47:49.589 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 15:47:49.589 [error] user-prometheus_nightly: No server info found
2025-06-27 15:47:49.590 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 15:47:49.590 [error] user-prometheus_dev: No server info found
2025-06-27 15:47:51.563 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 15:47:51.563 [error] user-prometheus_stage: No server info found
2025-06-27 15:47:51.564 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 15:47:51.564 [error] user-prometheus_nightly: No server info found
2025-06-27 15:47:51.565 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 15:47:51.565 [error] user-prometheus_dev: No server info found
</code></pre>
<blockquote>
<p>Note that <code>some-dev-server.com</code> is an example name (DNS / Domain name). I changed the real DNS name in the logs to remove sensitive, confidential and private information. If you literally use <code>some-dev-server.com</code> in your MCP tools settings, it will coincidentally give the same logs, mostly because it doesn’t point to anything - as in - there are no DNS records defined for it. You may not get the same logs if you define the name yourself - that is, create DNS records for <code>some-dev-server.com</code></p>
<p>As of this writing, <code>some-dev-server.com</code> is not an actual domain name that resolves to anything, since it has no DNS records at this point in time</p>
</blockquote>
<p>You can see how there are a lot of logs. I can see some 401s, and what not. There’s even a <code>getaddrinfo</code> error</p>
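<p>By the way, when there are this many lines, I find it helps to copy the Output panel contents into a file and filter just the error lines in the terminal. A rough sketch (plain shell, nothing Cursor-specific; the sample lines are copied from the logs above):</p>

```shell
# Save some Output panel lines to a file (a small sample is inlined here),
# then keep only the [error] lines, strip the timestamp (first 3 fields),
# and count how often each distinct error message appears.
cat <<'EOF' > /tmp/mcp-output.log
2025-06-27 12:11:47.509 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 12:11:47.509 [error] user-prometheus_stage: No server info found
2025-06-27 12:11:52.074 [error] user-prometheus_nightly: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 12:11:55.783 [error] user-prometheus_stage: No server info found
EOF
grep '\[error\]' /tmp/mcp-output.log | cut -d' ' -f4- | sort | uniq -c | sort -rn
```

<p>For this sample, the top line of the output shows <code>No server info found</code> appearing twice, so the most frequent errors surface first instead of being buried in the noise</p>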
<p>Previously, I saw some timeout errors. Then I used <code>curl</code> in my terminal, noticed the name resolution error, and then used <code>dig</code> to confirm. For example -</p>
<pre><code class="lang-plaintext">&gt; curl https://some-dev-server.com/prometheus/mcp
curl: (6) Could not resolve host: some-dev-server.com

&gt; dig some-dev-server.com

; &lt;&lt;&gt;&gt; DiG 9.10.6 &lt;&lt;&gt;&gt; some-dev-server.com
;; global options: +cmd
;; Got answer:
;; -&gt;&gt;HEADER&lt;&lt;- opcode: QUERY, status: NXDOMAIN, id: 39008
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;some-dev-server.com.        IN    A

;; AUTHORITY SECTION:
com.            882    IN    SOA    a.gtld-servers.net. nstld.verisign-grs.com. 1751020243 1800 900 604800 900

;; Query time: 2 msec
;; SERVER: 127.0.2.2#53(127.0.2.2)
;; WHEN: Fri Jun 27 16:01:31 IST 2025
;; MSG SIZE  rcvd: 124
</code></pre>
<p>Please note that when you use <code>curl</code> to debug, you need to pass everything that you mention in your <code>mcp.json</code> - like the headers, which carry important information such as credentials for authentication and authorization. I say this because there can be authentication errors too, which sometimes get bubbled up to the AI Chat in Cursor through its MCP client. I have also noticed 5xx errors like <code>502</code> as well as 4xx errors like <code>401</code>.</p>
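<p>For example, a <code>curl</code> that mirrors my <code>mcp.json</code> would look something like this - note that the URL and <code>accessToken</code> value are the same placeholders as above, so this exact command fails with the name resolution error; with your real URL and token you’d look at the HTTP status and response body instead:</p>

```shell
# Replay the MCP request with the same headers as in mcp.json.
# URL and token are placeholders - substitute your real values.
curl -sS \
  -H 'Content-Type: application/json' \
  -H 'accessToken: put-a-valid-token-over-here' \
  https://some-dev-server.com/prometheus/mcp \
  || echo "curl failed with exit code $?"
# With the placeholder domain this typically prints:
# curl: (6) Could not resolve host: some-dev-server.com
# curl failed with exit code 6
```
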
<p>Also, to make your debugging easier, you can cut the noise (logs) from other MCP tools by disabling them in your Cursor Settings, for example, like this -</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751020628797/be7c9884-0bea-4740-a588-7387bcdd215d.png" alt class="image--center mx-auto" /></p>
<p>I have enabled <code>stage</code> alone and I notice this in my logs</p>
<pre><code class="lang-plaintext">2025-06-27 16:05:01.549 [info] user-prometheus_stage: Handling DeleteClient action
2025-06-27 16:05:04.427 [info] user-prometheus_nightly: Handling DeleteClient action
2025-06-27 16:05:05.545 [info] user-prometheus_dev: Handling DeleteClient action
2025-06-27 16:05:06.396 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 16:05:06.396 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 16:05:06.396 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 16:05:10.163 [error] user-prometheus_stage: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:05:10.164 [info] user-prometheus_stage: Client closed for command
2025-06-27 16:05:10.165 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:05:10.165 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:05:10.165 [info] user-prometheus_stage: Connecting to SSE server
2025-06-27 16:05:13.757 [error] user-prometheus_stage: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:05:13.758 [error] user-prometheus_stage: Error connecting to SSE server after fallback: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:05:13.758 [info] user-prometheus_stage: Client closed for command
2025-06-27 16:05:13.763 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:05:13.763 [error] user-prometheus_stage: No server info found
2025-06-27 16:05:13.764 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:05:13.764 [error] user-prometheus_nightly: No server info found
2025-06-27 16:05:13.774 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:05:13.774 [error] user-prometheus_dev: No server info found
</code></pre>
<p>Here you can see how there’s a log saying <code>Handling DeleteClient action</code> - this is shown by the tool, I think, when it gets disabled, so as to say that the client was deleted</p>
<p>And when you enable it, it shows <code>Creating streamableHttp transport</code> - again, this is shown by the tool, I think, when it gets enabled, so as to say that the client was created</p>
<p>You can see how there are 401 errors - so, maybe some access token issue. In my case, it was an access token issue in one of the tools; in another one, it was a DNS issue and a routing issue. At times, there were also intermittent connection issues - for example, gateway issues from Cloudflare, since we use Cloudflare as a gateway / firewall in front of our services - which is why I have seen <code>502</code> gateway errors. All of these I could find from the logs and also by using <code>curl</code></p>
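<p>A quick way to tell these cases apart from the terminal is to ask <code>curl</code> for just the HTTP status code - again with the same placeholder URL and token, so substitute your real values:</p>

```shell
# Print only the HTTP status code for the MCP endpoint.
# 401 suggests an access token problem, 502 a gateway problem,
# and 000 means the request never completed (e.g. a DNS failure).
curl -s -o /dev/null -w '%{http_code}\n' --max-time 10 \
  -H 'accessToken: put-a-valid-token-over-here' \
  https://some-dev-server.com/prometheus/mcp \
  || true
```

<p>The <code>|| true</code> at the end just keeps the command from failing the shell when the request itself fails (which the placeholder domain will)</p>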
<p>When I fix the access token issue in one of my tools, it looks like this -</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751021935482/f3d13694-709b-4b38-b448-19c9545e3596.png" alt class="image--center mx-auto" /></p>
<p>Notice the “green” mark - a green filled circle</p>
<p>A closer look, zoomed in</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751022220817/c3ce741b-abd1-4141-9b42-4feb1894c76d.png" alt class="image--center mx-auto" /></p>
<p>If you click on <code>5 tools enabled</code> - it will show you the tools. For example, here’s what it shows for <code>prometheus_nightly</code>:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751027704678/9f31bf9e-85f0-4969-aebb-4a9e5e8245b7.png" alt class="image--center mx-auto" /></p>
<p>Sometimes I can see a “red” mark - a red filled circle, like this -</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751021991576/9d4ff189-2b20-46a2-be1a-39e936101cb4.png" alt class="image--center mx-auto" /></p>
<p>A closer look, zoomed in</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751022051959/5fc23a6b-e5fd-4ab4-8013-b0a0acb81a24.png" alt class="image--center mx-auto" /></p>
<p>When I look at the logs, it says this -</p>
<pre><code class="lang-plaintext">2025-06-27 16:29:31.973 [error] user-prometheus_stage: Client error for command SSE stream disconnected: TypeError: terminated
</code></pre>
<p>It just says some error - a <code>TypeError</code> with the message <code>terminated</code>. I think it’s some connection termination or something. Also, note the SSE reference - looks like it uses SSE - Server Sent Events. Some references from Wikipedia and MDN (Mozilla Developer Network) Web APIs docs - <a target="_blank" href="https://en.wikipedia.org/wiki/Server-sent_events">https://en.wikipedia.org/wiki/Server-sent_events</a> , <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events">https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events</a></p>
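<p>Since the client falls back to SSE, you can also try talking to the SSE endpoint directly with <code>curl</code>. This is a sketch with the same placeholder URL and token as before - and whether your server speaks SSE at this exact path is an assumption on my part, so adjust as needed:</p>

```shell
# Stream Server Sent Events from the MCP endpoint directly.
# -N disables curl's output buffering so events appear as they arrive.
# URL and token are placeholders - substitute your real values.
curl -N --max-time 10 \
  -H 'Accept: text/event-stream' \
  -H 'accessToken: put-a-valid-token-over-here' \
  https://some-dev-server.com/prometheus/mcp \
  || echo "SSE connection failed with exit code $?"
```

<p>If the stream opens, you should see <code>event:</code> / <code>data:</code> lines arriving; if it drops mid-stream, that matches the <code>SSE stream disconnected</code> kind of error from the logs</p>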
<p>Here are some example logs from when I was messing with the network on my machine, to disconnect it from the private network where our MCP server is running -</p>
<pre><code class="lang-plaintext">2025-06-27 16:27:48.893 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 16:27:48.893 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 16:27:48.893 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 16:27:48.921 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:27:48.921 [error] user-prometheus_stage: No server info found
2025-06-27 16:27:48.921 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:27:48.921 [error] user-prometheus_nightly: No server info found
2025-06-27 16:27:48.921 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:27:48.921 [error] user-prometheus_dev: No server info found
2025-06-27 16:27:50.100 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:27:50.100 [error] user-prometheus_stage: No server info found
2025-06-27 16:27:50.106 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:27:50.106 [error] user-prometheus_nightly: No server info found
2025-06-27 16:27:50.106 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:27:50.106 [error] user-prometheus_dev: No server info found
2025-06-27 16:27:52.343 [error] user-prometheus_stage: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:27:52.345 [info] user-prometheus_stage: Client closed for command
2025-06-27 16:27:52.347 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:27:52.347 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:27:52.347 [info] user-prometheus_stage: Connecting to SSE server
2025-06-27 16:27:54.615 [info] user-prometheus_nightly: Handling DeleteClient action
2025-06-27 16:27:55.422 [info] user-prometheus_dev: Handling DeleteClient action
2025-06-27 16:27:55.825 [error] user-prometheus_stage: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:27:55.826 [error] user-prometheus_stage: Error connecting to SSE server after fallback: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:27:55.826 [info] user-prometheus_stage: Client closed for command
2025-06-27 16:27:55.828 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:27:55.828 [error] user-prometheus_stage: No server info found
2025-06-27 16:27:55.828 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:27:55.828 [error] user-prometheus_stage: No server info found
2025-06-27 16:27:55.830 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:27:55.831 [error] user-prometheus_nightly: No server info found
2025-06-27 16:27:55.831 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:27:55.831 [error] user-prometheus_nightly: No server info found
2025-06-27 16:27:55.831 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:27:55.831 [error] user-prometheus_dev: No server info found
2025-06-27 16:27:55.831 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:27:55.831 [error] user-prometheus_dev: No server info found
2025-06-27 16:28:17.872 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:28:17.872 [error] user-prometheus_stage: No server info found
2025-06-27 16:28:17.873 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:28:17.873 [error] user-prometheus_nightly: No server info found
2025-06-27 16:28:17.874 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:28:17.874 [error] user-prometheus_dev: No server info found
2025-06-27 16:28:23.308 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:28:23.308 [error] user-prometheus_stage: No server info found
2025-06-27 16:28:23.309 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:28:23.309 [error] user-prometheus_nightly: No server info found
2025-06-27 16:28:23.311 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:28:23.311 [error] user-prometheus_dev: No server info found
2025-06-27 16:28:26.884 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 16:28:26.884 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 16:28:26.884 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 16:28:26.893 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 16:28:26.893 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 16:28:26.893 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 16:28:30.115 [info] user-prometheus_stage: Successfully connected to streamableHttp server
2025-06-27 16:28:30.115 [info] user-prometheus_stage: Storing streamableHttp client
2025-06-27 16:28:30.117 [info] user-prometheus_stage: Successfully connected to streamableHttp server
2025-06-27 16:28:30.117 [info] user-prometheus_stage: A second client was created while connecting, discarding it.
2025-06-27 16:28:30.129 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:28:30.129 [info] user-prometheus_stage: Listing offerings
2025-06-27 16:28:30.130 [info] user-prometheus_stage: Connected to streamableHttp server, fetching offerings
2025-06-27 16:28:30.130 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:28:30.130 [info] user-prometheus_stage: Listing offerings
2025-06-27 16:28:30.131 [info] user-prometheus_stage: Connected to streamableHttp server, fetching offerings
2025-06-27 16:28:30.132 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:28:30.133 [info] user-prometheus_stage: Listing offerings
2025-06-27 16:28:30.133 [info] user-prometheus_stage: Connected to streamableHttp server, fetching offerings
2025-06-27 16:28:30.133 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:28:30.133 [info] user-prometheus_stage: Listing offerings
2025-06-27 16:28:30.133 [info] user-prometheus_stage: Connected to streamableHttp server, fetching offerings
2025-06-27 16:28:32.543 [info] listOfferings: Found 5 tools
2025-06-27 16:28:32.543 [info] user-prometheus_stage: Found 5 tools
2025-06-27 16:28:32.545 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:28:32.545 [error] user-prometheus_nightly: No server info found
2025-06-27 16:28:32.547 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:28:32.548 [error] user-prometheus_dev: No server info found
2025-06-27 16:28:32.551 [info] listOfferings: Found 5 tools
2025-06-27 16:28:32.551 [info] user-prometheus_stage: Found 5 tools
2025-06-27 16:28:32.553 [info] listOfferings: Found 5 tools
2025-06-27 16:28:32.553 [info] user-prometheus_stage: Found 5 tools
2025-06-27 16:28:32.554 [info] listOfferings: Found 5 tools
2025-06-27 16:28:32.554 [info] user-prometheus_stage: Found 5 tools
2025-06-27 16:28:32.554 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:28:32.554 [error] user-prometheus_nightly: No server info found
2025-06-27 16:28:32.560 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:28:32.560 [error] user-prometheus_nightly: No server info found
2025-06-27 16:28:32.560 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:28:32.560 [error] user-prometheus_nightly: No server info found
2025-06-27 16:28:32.561 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:28:32.561 [error] user-prometheus_dev: No server info found
2025-06-27 16:28:32.561 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:28:32.561 [error] user-prometheus_dev: No server info found
2025-06-27 16:28:32.564 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:28:32.565 [error] user-prometheus_dev: No server info found
2025-06-27 16:29:31.973 [error] user-prometheus_stage: Client error for command SSE stream disconnected: TypeError: terminated
2025-06-27 16:33:22.428 [info] user-prometheus_stage: Handling DeleteClient action
2025-06-27 16:33:22.428 [info] user-prometheus_stage: Cleaning up
2025-06-27 16:33:23.178 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 16:33:23.178 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 16:33:23.178 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 16:33:25.659 [info] user-prometheus_stage: Successfully connected to streamableHttp server
2025-06-27 16:33:25.659 [info] user-prometheus_stage: Storing streamableHttp client
2025-06-27 16:33:25.672 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:33:25.672 [info] user-prometheus_stage: Listing offerings
2025-06-27 16:33:25.672 [info] user-prometheus_stage: Connected to streamableHttp server, fetching offerings
2025-06-27 16:33:27.295 [info] listOfferings: Found 5 tools
2025-06-27 16:33:27.295 [info] user-prometheus_stage: Found 5 tools
2025-06-27 16:33:27.297 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:33:27.297 [error] user-prometheus_nightly: No server info found
2025-06-27 16:33:27.304 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:33:27.305 [error] user-prometheus_dev: No server info found
2025-06-27 16:34:18.011 [info] user-prometheus_stage: Handling DeleteClient action
2025-06-27 16:34:18.012 [info] user-prometheus_stage: Cleaning up
2025-06-27 16:34:18.711 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 16:34:18.712 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 16:34:18.712 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 16:34:28.972 [error] user-prometheus_stage: Client error for command fetch failed
2025-06-27 16:34:28.972 [info] user-prometheus_stage: Client closed for command
2025-06-27 16:34:28.973 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: fetch failed
2025-06-27 16:34:28.973 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: fetch failed
2025-06-27 16:34:28.973 [info] user-prometheus_stage: Connecting to SSE server
2025-06-27 16:34:39.550 [error] user-prometheus_stage: Client error for command SSE error: TypeError: fetch failed: Connect Timeout Error (attempted address: some-staging-server.com:443, timeout: 10000ms)
2025-06-27 16:34:39.551 [error] user-prometheus_stage: Error connecting to SSE server after fallback: SSE error: TypeError: fetch failed: Connect Timeout Error (attempted address: some-staging-server.com:443, timeout: 10000ms)
2025-06-27 16:34:39.551 [info] user-prometheus_stage: Client closed for command
2025-06-27 16:34:39.556 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:34:39.556 [error] user-prometheus_stage: No server info found
2025-06-27 16:34:39.562 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:34:39.562 [error] user-prometheus_nightly: No server info found
2025-06-27 16:34:39.566 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:34:39.566 [error] user-prometheus_dev: No server info found
</code></pre>
<blockquote>
<p>Note that you may not get the same logs - as of this writing, <code>some-staging-server.com</code> is a placeholder domain with no DNS records, so it doesn’t resolve to anything</p>
</blockquote>
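<p>One pattern worth calling out in the logs above - the client first tries the streamable HTTP transport and, if that fails, falls back to SSE. Roughly, that flow looks like this (the <code>connect</code> and <code>log</code> functions here are hypothetical stand-ins, not the editor’s actual implementation):</p>

```typescript
// Hedged sketch of the "try streamableHttp, fall back to SSE" pattern
// visible in the logs. Both callbacks are hypothetical stand-ins.

type Transport = "streamableHttp" | "sse";

async function connectWithFallback(
  connect: (t: Transport) => Promise<void>,
  log: (level: string, msg: string) => void,
): Promise<Transport | null> {
  try {
    log("info", "Connecting to streamableHttp server");
    await connect("streamableHttp");
    return "streamableHttp";
  } catch (err) {
    // Matches the "falling back to SSE" lines in the logs
    log("error", `Error connecting to streamableHttp server, falling back to SSE: ${err}`);
  }
  try {
    log("info", "Connecting to SSE server");
    await connect("sse");
    return "sse";
  } catch (err) {
    // Matches the "after fallback" lines in the logs
    log("error", `Error connecting to SSE server after fallback: ${err}`);
    return null;
  }
}
```

<p>That’s why each failure shows up twice in the logs in slightly different words - once for the streamable HTTP attempt and once for the SSE fallback.</p>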
<p>In some cases, I have noticed new errors too. For example, after fixing the network issues I had created, I disabled and re-enabled the MCP tool, but there was still an error for some reason. Look at this -</p>
<pre><code class="lang-plaintext">2025-06-27 16:27:48.893 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 16:27:48.893 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 16:27:48.893 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 16:27:48.921 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:27:48.921 [error] user-prometheus_stage: No server info found
2025-06-27 16:27:48.921 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:27:48.921 [error] user-prometheus_nightly: No server info found
2025-06-27 16:27:48.921 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:27:48.921 [error] user-prometheus_dev: No server info found
2025-06-27 16:27:50.100 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:27:50.100 [error] user-prometheus_stage: No server info found
2025-06-27 16:27:50.106 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:27:50.106 [error] user-prometheus_nightly: No server info found
2025-06-27 16:27:50.106 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:27:50.106 [error] user-prometheus_dev: No server info found
2025-06-27 16:27:52.343 [error] user-prometheus_stage: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:27:52.345 [info] user-prometheus_stage: Client closed for command
2025-06-27 16:27:52.347 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:27:52.347 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:27:52.347 [info] user-prometheus_stage: Connecting to SSE server
2025-06-27 16:27:54.615 [info] user-prometheus_nightly: Handling DeleteClient action
2025-06-27 16:27:55.422 [info] user-prometheus_dev: Handling DeleteClient action
2025-06-27 16:27:55.825 [error] user-prometheus_stage: Client error for command HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:27:55.826 [error] user-prometheus_stage: Error connecting to SSE server after fallback: HTTP 401 trying to load well-known OAuth metadata
2025-06-27 16:27:55.826 [info] user-prometheus_stage: Client closed for command
2025-06-27 16:27:55.828 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:27:55.828 [error] user-prometheus_stage: No server info found
2025-06-27 16:27:55.828 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:27:55.828 [error] user-prometheus_stage: No server info found
2025-06-27 16:27:55.830 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:27:55.831 [error] user-prometheus_nightly: No server info found
2025-06-27 16:27:55.831 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:27:55.831 [error] user-prometheus_nightly: No server info found
2025-06-27 16:27:55.831 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:27:55.831 [error] user-prometheus_dev: No server info found
2025-06-27 16:27:55.831 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:27:55.831 [error] user-prometheus_dev: No server info found
2025-06-27 16:28:17.872 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:28:17.872 [error] user-prometheus_stage: No server info found
2025-06-27 16:28:17.873 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:28:17.873 [error] user-prometheus_nightly: No server info found
2025-06-27 16:28:17.874 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:28:17.874 [error] user-prometheus_dev: No server info found
2025-06-27 16:28:23.308 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:28:23.308 [error] user-prometheus_stage: No server info found
2025-06-27 16:28:23.309 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:28:23.309 [error] user-prometheus_nightly: No server info found
2025-06-27 16:28:23.311 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:28:23.311 [error] user-prometheus_dev: No server info found
2025-06-27 16:28:26.884 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 16:28:26.884 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 16:28:26.884 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 16:28:26.893 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 16:28:26.893 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 16:28:26.893 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 16:28:30.115 [info] user-prometheus_stage: Successfully connected to streamableHttp server
2025-06-27 16:28:30.115 [info] user-prometheus_stage: Storing streamableHttp client
2025-06-27 16:28:30.117 [info] user-prometheus_stage: Successfully connected to streamableHttp server
2025-06-27 16:28:30.117 [info] user-prometheus_stage: A second client was created while connecting, discarding it.
2025-06-27 16:28:30.129 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:28:30.129 [info] user-prometheus_stage: Listing offerings
2025-06-27 16:28:30.130 [info] user-prometheus_stage: Connected to streamableHttp server, fetching offerings
2025-06-27 16:28:30.130 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:28:30.130 [info] user-prometheus_stage: Listing offerings
2025-06-27 16:28:30.131 [info] user-prometheus_stage: Connected to streamableHttp server, fetching offerings
2025-06-27 16:28:30.132 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:28:30.133 [info] user-prometheus_stage: Listing offerings
2025-06-27 16:28:30.133 [info] user-prometheus_stage: Connected to streamableHttp server, fetching offerings
2025-06-27 16:28:30.133 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:28:30.133 [info] user-prometheus_stage: Listing offerings
2025-06-27 16:28:30.133 [info] user-prometheus_stage: Connected to streamableHttp server, fetching offerings
2025-06-27 16:28:32.543 [info] listOfferings: Found 5 tools
2025-06-27 16:28:32.543 [info] user-prometheus_stage: Found 5 tools
2025-06-27 16:28:32.545 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:28:32.545 [error] user-prometheus_nightly: No server info found
2025-06-27 16:28:32.547 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:28:32.548 [error] user-prometheus_dev: No server info found
2025-06-27 16:28:32.551 [info] listOfferings: Found 5 tools
2025-06-27 16:28:32.551 [info] user-prometheus_stage: Found 5 tools
2025-06-27 16:28:32.553 [info] listOfferings: Found 5 tools
2025-06-27 16:28:32.553 [info] user-prometheus_stage: Found 5 tools
2025-06-27 16:28:32.554 [info] listOfferings: Found 5 tools
2025-06-27 16:28:32.554 [info] user-prometheus_stage: Found 5 tools
2025-06-27 16:28:32.554 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:28:32.554 [error] user-prometheus_nightly: No server info found
2025-06-27 16:28:32.560 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:28:32.560 [error] user-prometheus_nightly: No server info found
2025-06-27 16:28:32.560 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:28:32.560 [error] user-prometheus_nightly: No server info found
2025-06-27 16:28:32.561 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:28:32.561 [error] user-prometheus_dev: No server info found
2025-06-27 16:28:32.561 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:28:32.561 [error] user-prometheus_dev: No server info found
2025-06-27 16:28:32.564 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:28:32.565 [error] user-prometheus_dev: No server info found
2025-06-27 16:29:31.973 [error] user-prometheus_stage: Client error for command SSE stream disconnected: TypeError: terminated
2025-06-27 16:33:22.428 [info] user-prometheus_stage: Handling DeleteClient action
2025-06-27 16:33:22.428 [info] user-prometheus_stage: Cleaning up
2025-06-27 16:33:23.178 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 16:33:23.178 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 16:33:23.178 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 16:33:25.659 [info] user-prometheus_stage: Successfully connected to streamableHttp server
2025-06-27 16:33:25.659 [info] user-prometheus_stage: Storing streamableHttp client
2025-06-27 16:33:25.672 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:33:25.672 [info] user-prometheus_stage: Listing offerings
2025-06-27 16:33:25.672 [info] user-prometheus_stage: Connected to streamableHttp server, fetching offerings
2025-06-27 16:33:27.295 [info] listOfferings: Found 5 tools
2025-06-27 16:33:27.295 [info] user-prometheus_stage: Found 5 tools
2025-06-27 16:33:27.297 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:33:27.297 [error] user-prometheus_nightly: No server info found
2025-06-27 16:33:27.304 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:33:27.305 [error] user-prometheus_dev: No server info found
2025-06-27 16:34:18.011 [info] user-prometheus_stage: Handling DeleteClient action
2025-06-27 16:34:18.012 [info] user-prometheus_stage: Cleaning up
2025-06-27 16:34:18.711 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 16:34:18.712 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 16:34:18.712 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 16:34:28.972 [error] user-prometheus_stage: Client error for command fetch failed
2025-06-27 16:34:28.972 [info] user-prometheus_stage: Client closed for command
2025-06-27 16:34:28.973 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: fetch failed
2025-06-27 16:34:28.973 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: fetch failed
2025-06-27 16:34:28.973 [info] user-prometheus_stage: Connecting to SSE server
2025-06-27 16:34:39.550 [error] user-prometheus_stage: Client error for command SSE error: TypeError: fetch failed: Connect Timeout Error (attempted address: some-staging-server.com, timeout: 10000ms)
2025-06-27 16:34:39.551 [error] user-prometheus_stage: Error connecting to SSE server after fallback: SSE error: TypeError: fetch failed: Connect Timeout Error (attempted address: some-staging-server.com, timeout: 10000ms)
2025-06-27 16:34:39.551 [info] user-prometheus_stage: Client closed for command
2025-06-27 16:34:39.556 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:34:39.556 [error] user-prometheus_stage: No server info found
2025-06-27 16:34:39.562 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:34:39.562 [error] user-prometheus_nightly: No server info found
2025-06-27 16:34:39.566 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:34:39.566 [error] user-prometheus_dev: No server info found
2025-06-27 16:35:42.035 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:35:42.035 [error] user-prometheus_stage: No server info found
2025-06-27 16:35:42.036 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:35:42.036 [error] user-prometheus_nightly: No server info found
2025-06-27 16:35:42.037 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:35:42.037 [error] user-prometheus_dev: No server info found
2025-06-27 16:59:05.015 [info] user-prometheus_stage: Handling DeleteClient action
2025-06-27 16:59:05.618 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 16:59:05.618 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 16:59:05.618 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 16:59:41.350 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 16:59:41.350 [error] user-prometheus_stage: No server info found
2025-06-27 16:59:41.351 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 16:59:41.351 [error] user-prometheus_nightly: No server info found
2025-06-27 16:59:41.352 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 16:59:41.352 [error] user-prometheus_dev: No server info found
2025-06-27 17:00:05.627 [info] user-prometheus_stage: Client closed for command
2025-06-27 17:00:05.627 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: MCP error -32001: Request timed out
2025-06-27 17:00:05.627 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: MCP error -32001: Request timed out
2025-06-27 17:00:05.627 [info] user-prometheus_stage: Connecting to SSE server
2025-06-27 17:00:05.628 [error] user-prometheus_stage: Client error for command This operation was aborted
2025-06-27 17:00:05.637 [error] user-prometheus_stage: Client error for command This operation was aborted
2025-06-27 17:00:05.637 [error] user-prometheus_stage: Client error for command Failed to send cancellation: AbortError: This operation was aborted
2025-06-27 17:00:08.060 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:00:08.060 [error] user-prometheus_stage: No server info found
2025-06-27 17:00:08.060 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:00:08.061 [error] user-prometheus_nightly: No server info found
2025-06-27 17:00:08.061 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:00:08.061 [error] user-prometheus_dev: No server info found
2025-06-27 17:00:08.848 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:00:08.848 [error] user-prometheus_stage: No server info found
2025-06-27 17:00:08.849 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:00:08.849 [error] user-prometheus_nightly: No server info found
2025-06-27 17:00:08.850 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:00:08.850 [error] user-prometheus_dev: No server info found
2025-06-27 17:00:10.176 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:00:10.176 [error] user-prometheus_stage: No server info found
2025-06-27 17:00:10.177 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:00:10.177 [error] user-prometheus_stage: No server info found
2025-06-27 17:00:10.178 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:00:10.178 [error] user-prometheus_stage: No server info found
2025-06-27 17:00:10.178 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:00:10.178 [error] user-prometheus_nightly: No server info found
2025-06-27 17:00:10.179 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:00:10.179 [error] user-prometheus_nightly: No server info found
2025-06-27 17:00:10.179 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:00:10.179 [error] user-prometheus_nightly: No server info found
2025-06-27 17:00:10.180 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:00:10.180 [error] user-prometheus_dev: No server info found
2025-06-27 17:00:10.180 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:00:10.180 [error] user-prometheus_dev: No server info found
2025-06-27 17:00:10.180 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:00:10.180 [error] user-prometheus_dev: No server info found
2025-06-27 17:00:13.676 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:00:13.676 [error] user-prometheus_stage: No server info found
2025-06-27 17:00:13.677 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:00:13.677 [error] user-prometheus_stage: No server info found
2025-06-27 17:00:13.677 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:00:13.677 [error] user-prometheus_nightly: No server info found
2025-06-27 17:00:13.678 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:00:13.678 [error] user-prometheus_nightly: No server info found
2025-06-27 17:00:13.680 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:00:13.680 [error] user-prometheus_dev: No server info found
2025-06-27 17:00:13.681 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:00:13.681 [error] user-prometheus_dev: No server info found
2025-06-27 17:00:16.628 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:00:16.628 [error] user-prometheus_stage: No server info found
2025-06-27 17:00:16.629 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:00:16.629 [error] user-prometheus_nightly: No server info found
2025-06-27 17:00:16.631 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:00:16.631 [error] user-prometheus_dev: No server info found
2025-06-27 17:01:06.431 [error] user-prometheus_stage: Client error for command SSE error: Non-200 status code (504)
2025-06-27 17:01:06.431 [error] user-prometheus_stage: Error connecting to SSE server after fallback: SSE error: Non-200 status code (504)
2025-06-27 17:01:06.432 [info] user-prometheus_stage: Client closed for command
2025-06-27 17:01:06.437 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:01:06.437 [error] user-prometheus_stage: No server info found
2025-06-27 17:01:06.447 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:01:06.447 [error] user-prometheus_nightly: No server info found
2025-06-27 17:01:06.450 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:01:06.450 [error] user-prometheus_dev: No server info found
2025-06-27 17:02:03.770 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:02:03.771 [error] user-prometheus_stage: No server info found
2025-06-27 17:02:03.771 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:02:03.771 [error] user-prometheus_nightly: No server info found
2025-06-27 17:02:03.772 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:02:03.772 [error] user-prometheus_dev: No server info found
2025-06-27 17:02:20.189 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:02:20.190 [error] user-prometheus_stage: No server info found
2025-06-27 17:02:20.191 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:02:20.191 [error] user-prometheus_stage: No server info found
2025-06-27 17:02:20.192 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:02:20.192 [error] user-prometheus_stage: No server info found
2025-06-27 17:02:20.192 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:02:20.192 [error] user-prometheus_nightly: No server info found
2025-06-27 17:02:20.193 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:02:20.193 [error] user-prometheus_nightly: No server info found
2025-06-27 17:02:20.193 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:02:20.193 [error] user-prometheus_nightly: No server info found
2025-06-27 17:02:20.194 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:02:20.194 [error] user-prometheus_dev: No server info found
2025-06-27 17:02:20.195 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:02:20.195 [error] user-prometheus_dev: No server info found
2025-06-27 17:02:20.195 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:02:20.195 [error] user-prometheus_dev: No server info found
2025-06-27 17:02:22.766 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:02:22.766 [error] user-prometheus_stage: No server info found
2025-06-27 17:02:22.767 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:02:22.767 [error] user-prometheus_stage: No server info found
2025-06-27 17:02:22.767 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:02:22.767 [error] user-prometheus_nightly: No server info found
2025-06-27 17:02:22.768 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:02:22.768 [error] user-prometheus_nightly: No server info found
2025-06-27 17:02:22.768 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:02:22.768 [error] user-prometheus_dev: No server info found
2025-06-27 17:02:22.768 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:02:22.768 [error] user-prometheus_dev: No server info found
2025-06-27 17:02:27.308 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:02:27.309 [error] user-prometheus_stage: No server info found
2025-06-27 17:02:27.311 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:02:27.311 [error] user-prometheus_nightly: No server info found
2025-06-27 17:02:27.312 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:02:27.312 [error] user-prometheus_dev: No server info found
2025-06-27 17:04:02.534 [info] user-prometheus_stage: Handling DeleteClient action
2025-06-27 17:04:03.669 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 17:04:03.669 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 17:04:03.669 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 17:04:24.017 [info] user-prometheus_stage: Handling DeleteClient action
2025-06-27 17:04:24.906 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 17:04:24.906 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 17:04:24.906 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 17:04:35.388 [error] user-prometheus_stage: Client error for command fetch failed
2025-06-27 17:04:35.389 [info] user-prometheus_stage: Client closed for command
2025-06-27 17:04:35.389 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: fetch failed
2025-06-27 17:04:35.389 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: fetch failed
2025-06-27 17:04:35.390 [info] user-prometheus_stage: Connecting to SSE server
2025-06-27 17:04:45.946 [error] user-prometheus_stage: Client error for command SSE error: TypeError: fetch failed: Connect Timeout Error (attempted address: some-staging-server.com, timeout: 10000ms)
2025-06-27 17:04:45.947 [error] user-prometheus_stage: Error connecting to SSE server after fallback: SSE error: TypeError: fetch failed: Connect Timeout Error (attempted address: some-staging-server.com, timeout: 10000ms)
2025-06-27 17:04:45.947 [info] user-prometheus_stage: Client closed for command
2025-06-27 17:04:45.951 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:04:45.951 [error] user-prometheus_stage: No server info found
2025-06-27 17:04:45.952 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:04:45.952 [error] user-prometheus_nightly: No server info found
2025-06-27 17:04:45.952 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:04:45.953 [error] user-prometheus_dev: No server info found
2025-06-27 17:04:56.954 [info] user-prometheus_stage: Handling DeleteClient action
2025-06-27 17:04:58.118 [info] user-prometheus_stage: Handling CreateClient action
2025-06-27 17:04:58.118 [info] user-prometheus_stage: Creating streamableHttp transport
2025-06-27 17:04:58.118 [info] user-prometheus_stage: Connecting to streamableHttp server
2025-06-27 17:05:03.678 [info] user-prometheus_stage: Client closed for command
2025-06-27 17:05:03.678 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: MCP error -32001: Request timed out
2025-06-27 17:05:03.679 [error] user-prometheus_stage: Error connecting to streamableHttp server, falling back to SSE: MCP error -32001: Request timed out
2025-06-27 17:05:03.679 [info] user-prometheus_stage: Connecting to SSE server
2025-06-27 17:05:03.679 [error] user-prometheus_stage: Client error for command This operation was aborted
2025-06-27 17:05:03.690 [error] user-prometheus_stage: Client error for command This operation was aborted
2025-06-27 17:05:03.690 [error] user-prometheus_stage: Client error for command Failed to send cancellation: AbortError: This operation was aborted
</code></pre>
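<p>One thing the logs above make visible is the client’s transport strategy: it tries the streamable HTTP transport first and, on failure, falls back to SSE. This is a generic sketch of that pattern in plain Python - the function and transport names are just illustrative, not the actual MCP client code:</p>

```python
def connect_with_fallback(primary, fallback):
    """Try the primary transport; on any error, fall back to the secondary.

    `primary` and `fallback` are zero-argument callables that either return
    a connected client or raise an exception (hypothetical stand-ins for
    the real transport connect functions).
    """
    try:
        # First attempt: the preferred transport (streamable HTTP in the logs)
        return primary(), "streamableHttp"
    except Exception:
        # Primary failed (e.g. "fetch failed" / "Request timed out"): try SSE
        return fallback(), "sse"
```

<p>With a working primary you get the streamable HTTP client; with a failing primary you get the SSE client, which mirrors the “Error connecting to streamableHttp server, falling back to SSE” lines above.</p>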
<blockquote>
<p>Note that you may not get the same logs: as of this writing, <code>some-staging-server</code> is not an actual domain name and has no DNS records, so it does not resolve to anything</p>
</blockquote>
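<p>If you want to quickly check whether a hostname resolves before digging into MCP client errors, a small sketch in plain Python (nothing MCP-specific, just the standard library resolver):</p>

```python
import socket

def resolves(host: str) -> bool:
    """Return True if `host` can be resolved to an IP address."""
    try:
        socket.gethostbyname(host)
        return True
    except socket.gaierror:
        # Name resolution failed: no DNS records / unknown host
        return False
```

<p>A host with no DNS records, like the staging server above at the time of writing, would return <code>False</code> here, which lines up with the “fetch failed” and connect-timeout errors in the logs.</p>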
<p>When things work, the logs show the number of tools that the MCP server has, for example, something like this -</p>
<pre><code class="lang-plaintext">2025-06-27 17:33:46.638 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:33:46.638 [error] user-prometheus_stage: No server info found
2025-06-27 17:33:46.641 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:33:46.641 [error] user-prometheus_nightly: No server info found
2025-06-27 17:33:46.643 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:33:46.643 [error] user-prometheus_dev: No server info found
2025-06-27 17:33:49.058 [info] user-prometheus_nightly: Handling CreateClient action
2025-06-27 17:33:49.058 [info] user-prometheus_nightly: Creating streamableHttp transport
2025-06-27 17:33:49.058 [info] user-prometheus_nightly: Connecting to streamableHttp server
2025-06-27 17:33:51.960 [info] user-prometheus_nightly: Successfully connected to streamableHttp server
2025-06-27 17:33:51.960 [info] user-prometheus_nightly: Storing streamableHttp client
2025-06-27 17:33:51.973 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:33:51.973 [error] user-prometheus_stage: No server info found
2025-06-27 17:33:51.975 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:33:51.975 [info] user-prometheus_nightly: Listing offerings
2025-06-27 17:33:51.976 [info] user-prometheus_nightly: Connected to streamableHttp server, fetching offerings
2025-06-27 17:33:53.415 [info] listOfferings: Found 5 tools
2025-06-27 17:33:53.416 [info] user-prometheus_nightly: Found 5 tools
2025-06-27 17:33:53.417 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:33:53.417 [error] user-prometheus_dev: No server info found
2025-06-27 17:33:57.333 [error] user-prometheus_stage: Client error for command SSE error: Non-200 status code (504)
2025-06-27 17:33:57.334 [error] user-prometheus_stage: Error connecting to SSE server after fallback: SSE error: Non-200 status code (504)
2025-06-27 17:33:57.334 [info] user-prometheus_stage: Client closed for command
2025-06-27 17:33:57.336 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:33:57.336 [error] user-prometheus_stage: No server info found
2025-06-27 17:33:57.339 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:33:57.339 [info] user-prometheus_nightly: Listing offerings
2025-06-27 17:33:57.339 [info] user-prometheus_nightly: Connected to streamableHttp server, fetching offerings
2025-06-27 17:33:58.533 [info] listOfferings: Found 5 tools
2025-06-27 17:33:58.533 [info] user-prometheus_nightly: Found 5 tools
2025-06-27 17:33:58.535 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:33:58.535 [error] user-prometheus_dev: No server info found
2025-06-27 17:34:41.878 [error] user-prometheus_stage: Client error for command SSE error: Non-200 status code (504)
2025-06-27 17:34:41.878 [error] user-prometheus_stage: Error connecting to SSE server after fallback: SSE error: Non-200 status code (504)
2025-06-27 17:34:41.879 [info] user-prometheus_stage: Client closed for command
2025-06-27 17:34:41.880 [info] user-prometheus_stage: Handling ListOfferings action
2025-06-27 17:34:41.881 [error] user-prometheus_stage: No server info found
2025-06-27 17:34:41.882 [info] user-prometheus_nightly: Handling ListOfferings action
2025-06-27 17:34:41.882 [info] user-prometheus_nightly: Listing offerings
2025-06-27 17:34:41.882 [info] user-prometheus_nightly: Connected to streamableHttp server, fetching offerings
2025-06-27 17:34:43.175 [info] listOfferings: Found 5 tools
2025-06-27 17:34:43.175 [info] user-prometheus_nightly: Found 5 tools
2025-06-27 17:34:43.177 [info] user-prometheus_dev: Handling ListOfferings action
2025-06-27 17:34:43.177 [error] user-prometheus_dev: No server info found
2025-06-27 17:34:53.798 [error] user-prometheus_nightly: Client error for command SSE stream disconnected: TypeError: terminated
</code></pre>
<p>Notice the <code>Found 5 tools</code> log. Later the connection seems to have failed, but everything still works, as you can see here in the logs -</p>
<pre><code class="lang-plaintext">
2025-06-27 17:34:53.798 [error] user-prometheus_nightly: Client error for command SSE stream disconnected: TypeError: terminated
2025-06-27 17:36:08.790 [info] user-prometheus_nightly: Handling CallTool action for tool 'execute_query'
2025-06-27 17:36:08.790 [info] user-prometheus_nightly: Calling tool 'execute_query' with toolCallId: toolu_01LKTmjdoV6uRaaDVX5cF624
2025-06-27 17:36:10.133 [info] user-prometheus_nightly: Successfully called tool 'execute_query'
2025-06-27 17:36:25.003 [info] user-prometheus_nightly: Handling CallTool action for tool 'execute_query'
2025-06-27 17:36:25.004 [info] user-prometheus_nightly: Calling tool 'execute_query' with toolCallId: toolu_01A4T9gaGmDQ1iCZhzr4oiZ5
2025-06-27 17:36:26.435 [info] user-prometheus_nightly: Successfully called tool 'execute_query'
</code></pre>
<p>When it’s green, it shows something like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751027704678/9f31bf9e-85f0-4969-aebb-4a9e5e8245b7.png" alt class="image--center mx-auto" /></p>
<p>Notice how it says the 5 tools are <code>execute_query</code>, <code>execute_range_query</code>, <code>list_metrics</code>, <code>get_metrics_metadata</code> and <code>get_targets</code></p>
<p>The AI Chat is also working and is able to pick and use the right tool among the MCP tools</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751026112203/7495f14b-de15-4abc-8b29-98c9419f412a.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751026147614/9300b4f9-6f46-42e3-bece-5c7c83121f74.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751026169040/9be4b96b-7d89-432b-8c7a-1a4c9814030e.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751026666723/1e3c4f76-4f95-4ffc-b665-45c78163d18b.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751026946784/8763cbf5-add7-421b-a218-12b0cc2325d9.png" alt class="image--center mx-auto" /></p>
<p>As we can see, the AI picked the right tool from the available tools and executed it</p>
<p>Also, note that I’m using Cursor AI and its AI Chat box, with Claude-4-sonnet, with thinking and MAX Mode enabled</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751027002215/b0ac8c04-1818-4165-aa97-d9a709ef74c5.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751027046854/82647f76-68d9-4dc4-a391-9673bada06b7.png" alt class="image--center mx-auto" /></p>
<p>I’m also using <code>mcp.json</code> as my context. And my query was simply <code>get number of nodes</code>. I didn’t even mention the environment etc</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751027070212/699b5f6e-19fb-4b9e-a184-632275b45d92.png" alt class="image--center mx-auto" /></p>
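<p>I’m not showing my actual <code>mcp.json</code> here, but as a rough sketch, a Cursor <code>mcp.json</code> for remote HTTP MCP servers generally looks something like this - the server names are taken from the logs above, the URLs are placeholders, and the exact schema may differ across Cursor versions:</p>

```json
{
  "mcpServers": {
    "prometheus_dev": { "url": "https://some-dev-server.com/mcp" },
    "prometheus_stage": { "url": "https://some-staging-server.com/mcp" },
    "prometheus_nightly": { "url": "https://some-nightly-server.com/mcp" }
  }
}
```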
<p>Also, I’m using Agent mode, not Ask mode or Manual mode</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751027091798/ea64f9bd-dcef-45ac-b495-e2184c33a8ef.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751027118031/0129a235-8ae3-41aa-9c08-cc5a413419de.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751027137823/537ce8eb-cae3-4d29-a29a-3f0fa968a5bd.png" alt class="image--center mx-auto" /></p>
<p>Note that though the AI did well in this chat, I have noticed in many other chats that, for the same get-node-count query, it got stuck doing the same thing again and again. In those cases, it kept executing <code>list_metrics</code> repeatedly and thinking through the same steps repeatedly. Maybe some AI backend error (in Claude, for example), not sure though</p>
]]></content:encoded></item><item><title><![CDATA[A Small Generic Guide For A Software Engineer Career. Part 1]]></title><description><![CDATA[I have been working professionally since mid 2017, starting with freelancing and then working at companies from the start of 2018
Over the years, these are some of the things that I have learned

There’s always room for improvement

I have always fel...]]></description><link>https://karuppiah.dev/a-small-generic-guide-for-a-software-engineer-career-part-1</link><guid isPermaLink="true">https://karuppiah.dev/a-small-generic-guide-for-a-software-engineer-career-part-1</guid><category><![CDATA[software development]]></category><category><![CDATA[Software Engineering]]></category><category><![CDATA[software architecture]]></category><category><![CDATA[software]]></category><category><![CDATA[Software]]></category><category><![CDATA[Career]]></category><category><![CDATA[career advice]]></category><category><![CDATA[Open Source]]></category><category><![CDATA[Open source software]]></category><category><![CDATA[Open Source Community]]></category><category><![CDATA[Free Open Source Software ]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Tue, 10 Jun 2025 08:26:08 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/Hcfwew744z4/upload/24cec9879a76b69287b7129a1ac3b358.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I have been working professionally since mid 2017, starting with freelancing and then working at companies from the start of 2018</p>
<p>Over the years, these are some of the things that I have learned</p>
<blockquote>
<p>There’s always room for improvement</p>
</blockquote>
<p>I have always felt this with respect to my career. I feel like I failed myself by not doing enough, since I could have done so many things! A recent conversation with a senior of mine about this led to him confessing that he feels the same too, that this feeling is always gonna be there, and that it’s important to be okay with it</p>
<p>Apart from the regret of not doing enough, I think this idea can be taken in a positive manner, to always look for more opportunities in any job you are in. For example, these are some of the things I could have done:</p>
<blockquote>
<p>Read Code</p>
</blockquote>
<p>Read code. Seriously! There are a lot of people who write a lot of code, especially beginners, either by themselves or using AI. The thing is, it’s important to read more and more code, to be able to understand code, gain new knowledge, gain new perspectives, understand the nitty-gritty details, and notice and discover interesting patterns</p>
<p>If you are a newbie, start by looking at any source code that you can find in a programming language that you understand. It’s okay even if it’s not “good code” or if it’s “bad code”. You will notice how people write code, what standards they follow, how they do error handling, how they design the software project - structuring the files, breaking down the software into smaller parts and writing them as functions or methods (in classes or similar). If you are in a company, look at the source code of different projects of the company. If you are not yet professionally working, just take a look at the source code of open source projects. There are many great open source projects out there</p>
<blockquote>
<p>Look for Patterns</p>
</blockquote>
<p>Whatever you do, look for patterns. I believe that humans learn very well based on patterns. Literally do pattern matching with patterns you already know, or discover new patterns you didn’t know. For example, look at patterns in source code - these can be design patterns. Look at patterns in software in general - in architecture, in algorithms. Look at patterns in how the company works. Look at patterns in how people do things and work. Look at patterns in how the team you work in works. Look at patterns in the system - both the software world and the real world. It will help you tons. For example, I know <a target="_blank" href="https://www.thoughtworks.com/profiles/u/unmesh-joshi">Unmesh Joshi</a> from ThoughtWorks, my previous company, who found patterns in distributed systems and shared them with the world through his blog posts and a book. It was an interesting thing indeed. And there are many people who have done such things in the past, finding patterns and sharing them with the world</p>
<p>More references:</p>
<ul>
<li><p><a target="_blank" href="https://en.wikipedia.org/wiki/Software_design_pattern">https://en.wikipedia.org/wiki/Software_design_pattern</a></p>
</li>
<li><p>Patterns in Distributed Systems</p>
<ul>
<li><p><a target="_blank" href="https://martinfowler.com/books/patterns-distributed.html">https://martinfowler.com/books/patterns-distributed.html</a></p>
</li>
<li><p><a target="_blank" href="https://martinfowler.com/articles/patterns-of-distributed-systems/">https://martinfowler.com/articles/patterns-of-distributed-systems/</a></p>
</li>
<li><p><a target="_blank" href="https://www.thoughtworks.com/search?q=Unmesh%20Joshi">https://www.thoughtworks.com/search?q=Unmesh%20Joshi</a></p>
</li>
<li><p><a target="_blank" href="https://www.amazon.in/stores/author/B0CQC85KGR/allbooks">https://www.amazon.in/stores/author/B0CQC85KGR/allbooks</a></p>
</li>
<li><p><a target="_blank" href="https://www.amazon.in/Patterns-Distributed-Systems-Addison-Wesley-Signature/dp/0138221987">https://www.amazon.in/Patterns-Distributed-Systems-Addison-Wesley-Signature/dp/0138221987</a></p>
</li>
<li><p><a target="_blank" href="https://www.amazon.in/Patterns-Distributed-Systems-Addison-Wesley-Signature-ebook/dp/B0CCD3F8BH">https://www.amazon.in/Patterns-Distributed-Systems-Addison-Wesley-Signature-ebook/dp/B0CCD3F8BH</a></p>
</li>
<li><p><a target="_blank" href="https://www.amazon.in/Patterns-Distributed-Approach-Designing-Implementation/dp/9361590529">https://www.amazon.in/Patterns-Distributed-Approach-Designing-Implementation/dp/9361590529</a></p>
</li>
<li><p><a target="_blank" href="https://www.oreilly.com/library/view/patterns-of-distributed/9780138222246/">https://www.oreilly.com/library/view/patterns-of-distributed/9780138222246/</a></p>
</li>
</ul>
</li>
<li><p><a target="_blank" href="https://en.wikipedia.org/wiki/Design_Patterns">https://en.wikipedia.org/wiki/Design_Patterns</a></p>
</li>
<li><p><a target="_blank" href="https://refactoring.guru/design-patterns">https://refactoring.guru/design-patterns</a></p>
</li>
</ul>
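<p>As a tiny illustration of the kind of design pattern you might spot while reading code, here is the classic Strategy pattern sketched in Python - the retry-backoff example is just something I made up for illustration:</p>

```python
# Strategy pattern: pass the algorithm in as a value instead of
# hard-coding it with if/else branches.

def constant_backoff(attempt: int) -> float:
    """Always wait the same amount of time before a retry."""
    return 1.0

def exponential_backoff(attempt: int) -> float:
    """Double the wait time on every retry attempt."""
    return 2.0 ** attempt

def retry_delays(attempts: int, strategy) -> list:
    """Compute the delay before each retry using the given strategy."""
    return [strategy(i) for i in range(attempts)]
```

<p>The caller decides the behaviour by choosing which strategy function to pass, so new backoff policies can be added without touching <code>retry_delays</code> - that interchangeability is the pattern.</p>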
<blockquote>
<p>Focus on depth too and not just breadth. Especially focus on basics, core concepts</p>
</blockquote>
<p>As a newbie to the software industry, I was fascinated by the many things that were out there. I used to love trying out new things. And I still do. It’s great to try out as many new things as possible and learn from them. But it’s also important to stop, choose a few technologies and learn them in depth. Preferably technologies that have been around for a long time, are still widely used and don’t seem to be going anywhere soon. These kinds of technologies and concepts are usually core ones that rarely get discarded or changed. For example, if you learn about CPUs, GPUs, RAM etc, it will benefit you any day, until someone replaces them with something else, which is rare and will take quite some time.</p>
<p>So, focus on core concepts and basics. How do you know if something is a core concept or a basic? Just look at whatever you are working with and look one level deeper to understand how it works. You can go more levels deeper to understand how different things work, and you can keep doing this until you hit 0s and 1s (binary). So, there’s always something below the layer that you are working with.</p>
<p>For example, you could be working with programming languages, frameworks, tools, software systems etc, but you may not always know what goes on behind the scenes: how the programming language was built (language specification, compiler/interpreter, runtime etc), how the framework was built and how it works behind the scenes when you use its features, how the tools are built and how they work behind the scenes when you use their features, how the software systems that you use work, how they are built and what core concepts they are built on. For example, there are a lot of people who don’t know what happens behind the scenes in a data system - say a stateful database, or a stateful message broker, and so on</p>
<blockquote>
<p>Do more things outside of your purview</p>
</blockquote>
<p>If you are in a project, you don’t have to do just your project. Try to see how you can help other people in other projects, and see if you can contribute to other projects too. Try to advocate this idea if it’s not appreciated - some companies say “Work just on your project”, which is kind of saying “Stay in your bubble”. But it’s important that you don’t always stay in your bubble - that can lead to stagnation and also make you unaware of the things that go on in other projects, in other teams and in the company. It’s tricky because some companies are really huge - in terms of the product(s) and/or service(s) they provide, the number of people they have, the number of projects they have, the amount of software they have and use etc. But you can always do a bit more if you have the time, energy and will to do it, without affecting your life and career. Or else you miss out on learning more by staying comfortable. This can be seen as a Fear Of Missing Out (FOMO) problem too. But yeah, it’s good to try more and do more, even if you can’t do everything, especially when there are too many things to look at or do than is humanly possible within a small timeframe. Access to projects might be an issue in some places and companies; we talk about this later in a different section</p>
<blockquote>
<p>Take up more responsibilities</p>
</blockquote>
<p>I have noticed that the more we do, the more we are able to do. The less we do, the less we are able to do. If you don’t use your time to do anything productive and instead scroll social media, for example, or fall into similar distractions, you will end up doing things that suck up your time like a black hole. And you will use up your time, which is probably a lot, to get just a small piece of work done. This is kind of like <a target="_blank" href="https://en.wikipedia.org/wiki/Parkinson's_law">Parkinson’s Law</a>, which is</p>
<blockquote>
<p>Work expands so as to fill the time available for its completion</p>
</blockquote>
<p>So, yeah, give yourself more to do, and you will most probably be able to do more, by managing your time and energy really well. But don’t end up burning yourself out. Get enough rest (but not too much :)). Try to balance work and life :)</p>
<p>Also, the more you do with your mind and body, and the more you train them to do more, the more they will grow. Whereas when you do less with your mind and body, they get worse - they just rust 😅 Unless you have a disability that prevents you from doing something mentally or physically, I would recommend working on your mind and body by pushing them just the right amount to help them grow. This is based on science, so I recommend reading about it and not taking my word for it :)</p>
<blockquote>
<p>Be Smart First, Then Be Hardworking</p>
</blockquote>
<p>For a long time, I have usually been the kind of person who postpones things and procrastinates, in a very unproductive manner. This led to me preparing and studying for exams at the last minute and working hard then, rather than doing something smarter and better: studying little by little every day in a smart way and working hard on consistency, instead of crunching it all in one night or the week before the examination. The same applies to any situation - a piece of work you need to complete, a project you need to work on, and so on. I recommend being smart and cutting tasks down into smaller tasks, smaller chunks, especially when they are big, and then working on them in a smart way: estimate the time they would take and manage your time and energy while working on them. Always aim for consistency in working over a time period rather than doing the work in one go at the last minute.</p>
<p>Let me also tell you some of these things I did do in my career that helped me and that I liked / loved</p>
<blockquote>
<p>Contribute to Open Source Software</p>
</blockquote>
<p>I was fortunate enough to have been introduced to the concept of Open Source and Open Source Software, and to the different kinds of software, a long time ago - since high school. But in high school, I didn’t know or understand much about the power of open source. Later, in college, I realized how powerful open source and open source software are. By the way, there is open source hardware and open source art too, where people open source the blueprint of their hardware, which could be the circuitry in the case of an electronic or electrical appliance, or the files required to print a 3D object, or the source files required to modify and edit a piece of art, say an image, or they just provide images etc as open source. You can find some interesting examples of open source online. I also learned about open source licenses, which apply to open source software or any open source thing. It was interesting and in many ways complicated too. I was also fortunate enough to be able to contribute to open source software during my college time - to other people’s projects - and to build my own small projects. And I was fortunate enough to be in a team at my company that advocated using open source software and also contributing to it.</p>
<p>But don’t let hope, fortune, chance and luck dictate what happens to you. Instead, be deliberate and ensure you contribute to open source projects if it interests you and you have time for it. Or else keep it in mind for a time when you have free time and are bored with nothing to do ;) Companies that allow people from different internal projects to contribute to other internal projects call this contribution “internal open source” - so as to say that the projects (some or all of them) are internally open source, for anyone to read and in some cases even contribute to. That’s a pretty cool thing, because a lot of the time you don’t even have access and authorization to read the code of some projects and software, and the same is true for their documentation (architecture, usage etc)</p>
<blockquote>
<p>Network with people</p>
</blockquote>
<p>I networked with people within the company and outside the company - in person and also on online platforms like social media, including professional social media platforms like LinkedIn. This helped me with finding new jobs, learning from new people, getting referrals and many more things. It opens up a whole lot of opportunities. So, if you haven’t done it yet, I would highly recommend networking more and more with people, regardless of your social skills (introvert etc) and interest, just to open up a new world for you and to open up a big can of opportunities with the help of people</p>
<blockquote>
<p>Write Blogs. Create Content in general</p>
</blockquote>
<p>A lot of people just consume content. They just read text, watch videos etc. It’s worse if the content is just random stuff - entertainment, social media doom scrolling etc. It’s great to read helpful books, blogs, articles etc. that give you new perspectives, and the same is true for audio, video and other forms of content. A great thing to note is - when you also create content, you build credibility by showing your content. This can be internal content within a company or organization, or public content - for anyone in the world to see - in physical form (books, newspaper articles etc.) or non-physical form (online blogs, ebooks, online articles, YouTube videos etc.)</p>
<p>Some content creation doesn’t take much time and effort. For example, recording audio or video is usually easy, assuming you are okay with not preparing much, because preparation, doing checks etc. consume a lot of time. I recommend doing things impromptu. Writing blogs and text in general can take up a lot of time and effort, and you might end up reading and re-reading your content to look for mistakes.</p>
<p>Try to be very casual about your content and just ensure you hit publish on that text (blog, article etc.) or audio or video, and try to do more and more of it, to the extent that you care more about content creation and less about criticism, peer reviews and any other after-effects of content creation, like no likes, no shares, no comments etc. Just do it for yourself first. It might come in handy for you to learn from your past self, and also come in handy to show your hard work in content creation and become a source of credibility and knowledge - to show what you know, what you have done etc. :)</p>
]]></content:encoded></item><item><title><![CDATA[Interview Process Experiences: Part 1]]></title><description><![CDATA[I have been recently interviewing to understand if there are any roles that can work for me.
I recently applied to https://www.observe.ai/ and they sent me a questionnaire to answer. Let me share the form, the questionnaire and my answers so that peopl...]]></description><link>https://karuppiah.dev/interview-process-experiences-part-1</link><guid isPermaLink="true">https://karuppiah.dev/interview-process-experiences-part-1</guid><category><![CDATA[interview]]></category><category><![CDATA[interview questions]]></category><category><![CDATA[Interview tips]]></category><category><![CDATA[interview preparations]]></category><category><![CDATA[Interviews]]></category><category><![CDATA[interview process]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Tue, 06 May 2025 12:49:12 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/LQ1t-8Ms5PY/upload/faab84568f6c3804e28c18263e713525.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I have been recently interviewing to understand if there are any roles that can work for me.</p>
<p>I recently applied to <a target="_blank" href="https://www.observe.ai/">https://www.observe.ai/</a> and they sent me a questionnaire to answer. Let me share the form, the questionnaire and my answers so that people can learn from it, or just read through out of curiosity etc.</p>
<hr />
<h1 id="heading-observeaihttpobserveai-tech-lead-infrastructure-engg-information-form"><a target="_blank" href="http://Observe.AI">Observe.AI</a> || Tech Lead - Infrastructure Engg Information Form</h1>
<p>Hello there! We are from the Talent Team at <a target="_blank" href="http://Observe.ai">Observe.ai</a> and we are elated at the prospect that you would love to apply and work with us, if selected. Requesting you to kindly update the details on this form and share so that we can take it to the next steps immediately.</p>
<p>Thanks,</p>
<p>OAI TA Team</p>
<hr />
<h2 id="heading-hello-please-tell-us-your-name">Hello! Please tell us your name *</h2>
<p>Karuppiah Natarajan</p>
<h2 id="heading-please-tell-us-about-your-role-aspiration">Please tell us about your Role Aspiration *</h2>
<p>I'm looking for a Site Reliability Engineer (SRE) kind of role, where I'm not just doing some basic stuff that I have been doing for years - like setting up CI/CD pipelines, setting up infrastructure systems</p>
<h2 id="heading-what-are-the-current-projects-that-you-are-working-on">What are the Current Projects that you are working on? *</h2>
<p>I'm currently learning Rust. Previously, when I was working at Ola, my last company, I was working on infrastructure - specifically on secrets management and also helping with running Kubernetes Clusters on AWS and on Ola Krutrim Cloud ( <a target="_blank" href="https://www.olakrutrim.com/cloud">https://www.olakrutrim.com/cloud</a> )</p>
<h2 id="heading-the-technical-stack-that-you-are-working-with">The Technical Stack that you are working with? *</h2>
<p>Most of my career - I have used Golang</p>
<h2 id="heading-please-tell-us-about-your-tech-lead-experience">Please tell us about your Tech Lead experience *</h2>
<p>I have been a tech lead in VMWare, where I was working with two other colleagues. It was a small team and we were working on release engineering tools and creating tooling for testing and verifying complex software (Tanzu) that helped spin up and run and manage Kubernetes Clusters. It was open source - Tanzu Community Edition ( <a target="_blank" href="https://github.com/vmware-tanzu/community-edition">https://github.com/vmware-tanzu/community-edition</a> , <a target="_blank" href="http://tanzucommunityedition.io/">http://tanzucommunityedition.io</a> )</p>
<p>After that, I joined Togai, an early startup back then, now acquired by Zuora. I maintained the whole of Togai infrastructure, which was on AWS. We managed the infrastructure using Terraform and Chef Cookbooks. I helped set up monitoring and alerting for some of their services. I tried to help with rising costs in monitoring. I also fixed memory leak issues. I helped them set up low-cost CI/CD on GitHub Actions using our own infrastructure. Some blogs on those - <a target="_blank" href="https://karuppiah.dev/monitoring-nats-using-new-relic-instrumentation">https://karuppiah.dev/monitoring-nats-using-new-relic-instrumentation</a> , <a target="_blank" href="https://karuppiah.dev/upgrading-a-nats-cluster-in-production">https://karuppiah.dev/upgrading-a-nats-cluster-in-production</a> , <a target="_blank" href="https://karuppiah.dev/managing-github-organization-level-secrets-for-private-github-repositories-for-free-on-github-free-plan-using-terraform">https://karuppiah.dev/managing-github-organization-level-secrets-for-private-github-repositories-for-free-on-github-free-plan-using-terraform</a> , <a target="_blank" href="https://karuppiah.dev/resizing-disk-increasing-the-size">https://karuppiah.dev/resizing-disk-increasing-the-size</a> , <a target="_blank" href="https://karuppiah.dev/managing-tens-of-thousands-of-messages-in-deadletter-queues-at-togai">https://karuppiah.dev/managing-tens-of-thousands-of-messages-in-deadletter-queues-at-togai</a> , <a target="_blank" href="https://karuppiah.dev/understanding-data-ingested-in-new-relic">https://karuppiah.dev/understanding-data-ingested-in-new-relic</a> , <a target="_blank" 
href="https://karuppiah.dev/understanding-postgresql-new-relic-on-host-integration">https://karuppiah.dev/understanding-postgresql-new-relic-on-host-integration</a> , <a target="_blank" href="https://karuppiah.dev/self-hosting-github-actions-runners">https://karuppiah.dev/self-hosting-github-actions-runners</a> , <a target="_blank" href="https://karuppiah.dev/debugging-and-fixing-a-memory-leak-in-a-nodejs-service-in-production">https://karuppiah.dev/debugging-and-fixing-a-memory-leak-in-a-nodejs-service-in-production</a> . Some more links - <a target="_blank" href="https://github.com/karuppiah7890/github-actions-self-hosted-runner-terraform">https://github.com/karuppiah7890/github-actions-self-hosted-runner-terraform</a> , <a target="_blank" href="https://github.com/karuppiah7890/github-actions-secrets-terraform">https://github.com/karuppiah7890/github-actions-secrets-terraform</a> , <a target="_blank" href="https://github.com/karuppiah7890/ec2-killer">https://github.com/karuppiah7890/ec2-killer</a> , <a target="_blank" href="https://github.com/karuppiah7890/ec2-github-runner">https://github.com/karuppiah7890/ec2-github-runner</a> , <a target="_blank" href="https://github.com/karuppiah7890/pg-query-killer">https://github.com/karuppiah7890/pg-query-killer</a> , <a target="_blank" href="https://github.com/karuppiah7890/sqs-alerter">https://github.com/karuppiah7890/sqs-alerter</a> , <a target="_blank" href="https://github.com/karuppiah7890/sqs-dump">https://github.com/karuppiah7890/sqs-dump</a> , <a target="_blank" href="https://github.com/karuppiah7890/publish-to-nats">https://github.com/karuppiah7890/publish-to-nats</a> , <a target="_blank" href="https://github.com/karuppiah7890/sqs-delete">https://github.com/karuppiah7890/sqs-delete</a> , <a target="_blank" href="https://github.com/karuppiah7890/sqs-to-nats">https://github.com/karuppiah7890/sqs-to-nats</a> , <a target="_blank" 
href="https://github.com/karuppiah7890/stripe-to-togai">https://github.com/karuppiah7890/stripe-to-togai</a> , <a target="_blank" href="https://github.com/karuppiah7890/nats-docker">https://github.com/karuppiah7890/nats-docker</a> , <a target="_blank" href="https://github.com/karuppiah7890/aws-tools">https://github.com/karuppiah7890/aws-tools</a> , <a target="_blank" href="https://github.com/karuppiah7890/postgres-alerter">https://github.com/karuppiah7890/postgres-alerter</a> , <a target="_blank" href="https://github.com/karuppiah7890/redis-ha-check">https://github.com/karuppiah7890/redis-ha-check</a> , <a target="_blank" href="https://github.com/karuppiah7890/service-alerter">https://github.com/karuppiah7890/service-alerter</a> , <a target="_blank" href="https://github.com/karuppiah7890/puppet-server">https://github.com/karuppiah7890/puppet-server</a> , <a target="_blank" href="https://github.com/karuppiah7890/urlcrawl">https://github.com/karuppiah7890/urlcrawl</a> , <a target="_blank" href="https://github.com/karuppiah7890/redis-alerter">https://github.com/karuppiah7890/redis-alerter</a></p>
<p>Later, I joined Ola Cabs, my last company. There I worked as a lead (SDE-3) in a team of 8, all SDE-2, with an existing lead (SDE-3). I helped mentor some of the juniors. I worked with Kubernetes Clusters, CI/CD pipelines, Developer Platform (Backstage , <a target="_blank" href="https://backstage.io/">https://backstage.io</a> ), Hashicorp Vault for Secrets Management. I built tools whenever necessary. I also helped Security Team with Security Compliance. Some links around these - <a target="_blank" href="https://karuppiah.dev/trying-out-squid-proxy">https://karuppiah.dev/trying-out-squid-proxy</a> , <a target="_blank" href="https://karuppiah.dev/trying-to-authenticate-with-vault-using-openid-connect-oidc-using-dex">https://karuppiah.dev/trying-to-authenticate-with-vault-using-openid-connect-oidc-using-dex</a> , <a target="_blank" href="https://karuppiah.dev/trying-to-authenticate-in-a-demo-application-using-openid-connect-oidc-using-keycloak">https://karuppiah.dev/trying-to-authenticate-in-a-demo-application-using-openid-connect-oidc-using-keycloak</a> , <a target="_blank" href="https://karuppiah.dev/trying-out-prometheus-operator">https://karuppiah.dev/trying-out-prometheus-operator</a> , <a target="_blank" href="https://karuppiah.dev/listing-aws-ec2-instance-information-with-aws-cli-v2">https://karuppiah.dev/listing-aws-ec2-instance-information-with-aws-cli-v2</a> , <a target="_blank" href="https://karuppiah.dev/trying-out-prometheus-operators-alertmanager-and-alertmanager-config-custom-resources">https://karuppiah.dev/trying-out-prometheus-operators-alertmanager-and-alertmanager-config-custom-resources</a> , <a target="_blank" href="https://karuppiah.dev/aws-api-invalidsignatureexception-signature-expired-error">https://karuppiah.dev/aws-api-invalidsignatureexception-signature-expired-error</a> , <a target="_blank" href="https://karuppiah.dev/shipping-cloudwatch-logs-to-s3">https://karuppiah.dev/shipping-cloudwatch-logs-to-s3</a> , <a target="_blank" 
href="https://karuppiah.dev/working-with-backstage-software-templates">https://karuppiah.dev/working-with-backstage-software-templates</a> , <a target="_blank" href="https://github.com/karuppiah7890/vault-helm-chart">https://github.com/karuppiah7890/vault-helm-chart</a> , <a target="_blank" href="https://github.com/karuppiah7890/vault-policy-cp">https://github.com/karuppiah7890/vault-policy-cp</a> , <a target="_blank" href="https://github.com/karuppiah7890/vault-policy-backup">https://github.com/karuppiah7890/vault-policy-backup</a> , <a target="_blank" href="https://github.com/karuppiah7890/vault-policy-restore">https://github.com/karuppiah7890/vault-policy-restore</a> , <a target="_blank" href="https://github.com/karuppiah7890/vault-kv-cp">https://github.com/karuppiah7890/vault-kv-cp</a> ,<a target="_blank" href="https://github.com/karuppiah7890/vault-kv-backup">https://github.com/karuppiah7890/vault-kv-backup</a> , <a target="_blank" href="https://github.com/karuppiah7890/vault-kv-restore">https://github.com/karuppiah7890/vault-kv-restore</a> , <a target="_blank" href="https://github.com/karuppiah7890/vault-k8s-auth-cp">https://github.com/karuppiah7890/vault-k8s-auth-cp</a> , <a target="_blank" href="https://github.com/karuppiah7890/vault-k8s-auth-backup">https://github.com/karuppiah7890/vault-k8s-auth-backup</a> , <a target="_blank" href="https://github.com/karuppiah7890/vault-tooling-contributions">https://github.com/karuppiah7890/vault-tooling-contributions</a> , <a target="_blank" href="https://github.com/karuppiah7890/alertmanager-helm-chart">https://github.com/karuppiah7890/alertmanager-helm-chart</a></p>
<h2 id="heading-please-tell-us-out-your-project-leading-experience">Please tell us out your Project Leading experience *</h2>
<p>I liked mentoring and teaching people in the project. I also learned a lot from the people in the project. I customized our ways of working based on the team's comfort. I always tried to ensure that the project people had enough psychological safety to talk about anything they want by being very frank and always addressing the elephant in the room</p>
<h2 id="heading-this-role-requires-you-to-be-proficient-with-programming-please-let-me-know-if-you-have-written-atleast-100-lines-of-code-in-the-recent-past-which-has-been-used-in-production-please-share-your-experience">This role requires you to be proficient with programming. Please let me know if you have written atleast 100 lines of code in the recent past which has been used in production? Please share your experience.  *</h2>
<p>Yes, I have. You can check all my open source tools code in my GitHub <a target="_blank" href="https://github.com/karuppiah7890">https://github.com/karuppiah7890</a></p>
<h2 id="heading-could-you-please-share-your-github-link-with-us">Could you please share your GitHub link with us ? *</h2>
<p><a target="_blank" href="https://github.com/karuppiah7890">https://github.com/karuppiah7890</a></p>
<h2 id="heading-do-you-have-mentoring-experience-please-share-more-details">Do you have Mentoring Experience? Please share more details *</h2>
<p>Yes, I have mentored quite a few people - mostly juniors - teaching them about code and open source software</p>
<h2 id="heading-please-tell-about-any-recent-promotions-you-got">Please tell about any recent promotions you got? *</h2>
<p>I have gotten pay hikes in ThoughtWorks and VMWare. I got promoted in VMware from SDE-2 to SDE-3. I didn't stay in Togai or Ola for long to get pay hikes or promotions.</p>
<h2 id="heading-please-share-your-current-compensation-base-bonus-if-any-stocks">Please share your Current Compensation (Base + Bonus if any + Stocks) *</h2>
<p>Rs 45,00,000 per annum Fixed. No Bonus, No Stocks</p>
<h2 id="heading-please-share-your-compensation-expectations-cash-component-stocks">Please share your Compensation Expectations (Cash component + Stocks) *</h2>
<p>Rs 45,00,000 per annum Fixed. Not expecting stocks</p>
<h2 id="heading-whats-your-notice-period-do-mention-if-its-negotiable">What's your notice period? Do mention if its negotiable? *</h2>
<p>None / Nil. I can join immediately</p>
<h2 id="heading-are-you-actively-interviewing">Are you actively Interviewing?  *</h2>
<p>Yes</p>
<h2 id="heading-do-you-have-any-competitive-offer">Do you have any Competitive Offer? *</h2>
<p>No</p>
<h2 id="heading-bangalore-is-our-work-location-is-that-okay-with-you">Bangalore is our work location. Is that okay with you? *</h2>
<p>Yes</p>
<h2 id="heading-please-mention-if-you-have-any-questions-you-would-like-to-ask-and-we-would-call-you">Please mention if you have any questions you would like to ask and we would call you.  *</h2>
<p>I would like to know about the current team members and leadership (managers, reporting managers, hiring manager etc.) and their LinkedIn profiles. I would also like to talk to the current team and the leadership and understand where the company is at (what level, maturity etc.), where the company is at technically (services, infrastructure etc.), the kind of technical and non-technical problems the company is facing currently, what the current focus is, and what the focus is for the near future</p>
]]></content:encoded></item><item><title><![CDATA[May 4th 2025: Today I Learned (TIL)]]></title><description><![CDATA[Today I noticed the Report-To response header and learned that it’s a header that the browser consumes / uses and sends reports. For example, I noticed this while I was trying to apply to many companies on https://www.instahyre.com/ automatically usi...]]></description><link>https://karuppiah.dev/may-4th-2025-today-i-learned-til</link><guid isPermaLink="true">https://karuppiah.dev/may-4th-2025-today-i-learned-til</guid><category><![CDATA[TIL]]></category><category><![CDATA[todayilearned]]></category><category><![CDATA[content security policy]]></category><category><![CDATA[Content Security Policy (CSP)]]></category><category><![CDATA[Content Security Policy header]]></category><category><![CDATA[Security]]></category><category><![CDATA[securityawareness]]></category><category><![CDATA[Today I Learned]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Sun, 04 May 2025 14:34:31 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1746369191364/eecef0ab-a043-4cec-b771-abb998f84358.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Today I noticed the <code>Report-To</code> response header and learned that it’s a header that the browser consumes / uses and sends reports. For example, I noticed this while I was trying to apply to many companies on <a target="_blank" href="https://www.instahyre.com/">https://www.instahyre.com/</a> automatically using JavaScript. 
I opened up the list of jobs page at <a target="_blank" href="https://www.instahyre.com/candidate/opportunities/?matching=true">https://www.instahyre.com/candidate/opportunities/?matching=true</a> or the list of jobs page in a search, like <a target="_blank" href="https://www.instahyre.com/candidate/opportunities/?company_size=0&amp;job_functions=%2Fapi%2Fv1%2Fjob_function%2F8&amp;job_type=0&amp;search=true">https://www.instahyre.com/candidate/opportunities/?company_size=0&amp;job_functions=%2Fapi%2Fv1%2Fjob_function%2F8&amp;job_type=0&amp;search=true</a> or <a target="_blank" href="https://www.instahyre.com/candidate/opportunities/?company_size=0&amp;job_functions=%2Fapi%2Fv1%2Fjob_function%2F8,%2Fapi%2Fv1%2Fjob_function%2F10&amp;job_type=0&amp;search=true">https://www.instahyre.com/candidate/opportunities/?company_size=0&amp;job_functions=%2Fapi%2Fv1%2Fjob_function%2F8,%2Fapi%2Fv1%2Fjob_function%2F10&amp;job_type=0&amp;search=true</a></p>
<p>Initially I was clicking the <code>Apply</code> button a lot. At some point I realized I could use JavaScript - not exactly on my own, maybe; I was bored of clicking blindly and I recalled someone mentioning using JavaScript to apply on LinkedIn, and I was like “Hmm, why not just try that this time?”. So, I did try</p>
<p>I used the following JavaScript code and pasted it in my Google Chrome Browser Console window</p>
<blockquote>
<p><strong>Note: Weirdly it didn’t ask me to first type</strong> <code>allow pasting</code> <strong>or something of that sort, to prevent people from randomly copy-pasting JavaScript code they don’t know anything about - code which could steal API tokens, cookies, personal information or credentials, using web requests and/or the web page (front-end), and send it over to some remote server of the hacker</strong></p>
</blockquote>
<pre><code class="lang-javascript"><span class="hljs-comment">// without retrying for rate limiting errors</span>
<span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">sleep</span>(<span class="hljs-params">ms</span>) </span>{
    <span class="hljs-keyword">return</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Promise</span>(<span class="hljs-function"><span class="hljs-params">resolve</span> =&gt;</span> <span class="hljs-built_in">setTimeout</span>(resolve, ms));
}

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">clickButtons</span>(<span class="hljs-params"></span>) </span>{
    <span class="hljs-keyword">let</span> start = <span class="hljs-built_in">Date</span>.now()
    <span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">while</span> (<span class="hljs-literal">true</span>) {
        <span class="hljs-keyword">let</span> error = <span class="hljs-built_in">document</span>.querySelector(<span class="hljs-string">"#messages &gt; div &gt; div &gt; div &gt; div &gt; div"</span>)
        <span class="hljs-keyword">if</span> (error) {
            <span class="hljs-built_in">console</span>.log(<span class="hljs-string">"Found some error"</span>);
            <span class="hljs-keyword">let</span> closeErrorButton = <span class="hljs-built_in">document</span>.querySelector(<span class="hljs-string">"#messages &gt; div &gt; div &gt; div &gt; button"</span>);
            <span class="hljs-keyword">if</span> (closeErrorButton) {
                <span class="hljs-built_in">console</span>.log(<span class="hljs-string">"Closing the error"</span>);
                closeErrorButton.click();
            }
            <span class="hljs-built_in">console</span>.log(<span class="hljs-string">"Gonna exit now"</span>);
            <span class="hljs-keyword">let</span> end = <span class="hljs-built_in">Date</span>.now();
            <span class="hljs-built_in">console</span>.log(<span class="hljs-string">"time elapsed in milliseconds is: "</span>, end - start);
            <span class="hljs-keyword">break</span>;
        }

        <span class="hljs-keyword">await</span> sleep(<span class="hljs-number">500</span>);
        <span class="hljs-keyword">let</span> button = <span class="hljs-built_in">document</span>.querySelector(<span class="hljs-string">"#candidate-suggested-employers &gt; div &gt; div:nth-child(3) &gt; div &gt; div &gt; div.application-modal-wrap &gt; div.container &gt; div.row.bar-actions.ng-scope &gt; div.apply.ng-scope &gt; button"</span>);
        <span class="hljs-keyword">if</span> (button &amp;&amp; !button.disabled) {
            button.click()
            i++;
            <span class="hljs-built_in">console</span>.log(<span class="hljs-string">`clicked <span class="hljs-subst">${i}</span> times`</span>)
        }
    }
}

clickButtons();
</code></pre>
<p>When I did this automation, I noticed an error on the front-end at some point, which said something like “Something went wrong. Please try again later”. On checking, it was clear that Instahyre had implemented Rate Limiting on their backend systems, which is a great thing. I noticed that their <code>apply</code> endpoint was giving a <code>429</code> HTTP response, which translates to <code>Too Many Requests</code> and in the response, I noticed that it says some interesting things in the Response Headers, like <code>Retry-After</code>, <code>Report-To</code> etc</p>
<p>I used the <code>Retry-After</code> header to understand when I could retry. I think the value was around 30 minutes, expressed in seconds. I’m not sure what the retry-after time is when the rate limit is violated a second time, a third time etc. Usually it’s an exponential backoff - at least that’s kind of ideal - where the retry-after time keeps increasing, and increasing exponentially at that, so that an attacker (in many cases) is stopped from bombarding the backend systems with requests</p>
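<p>As a rough sketch of what honoring <code>Retry-After</code> with an exponential backoff fallback could look like (this is my own illustration, not Instahyre’s actual behavior - the URL and retry counts are made up):</p>

```javascript
// Sketch (my own, not from the site): retry a request that may be rate
// limited, honoring the Retry-After header when present and falling back
// to exponential backoff otherwise. The URL passed in below is made up.
function sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
}

async function fetchWithRetry(url, maxAttempts = 5) {
    let backoffMs = 1000; // fallback backoff, doubled after every 429
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        const response = await fetch(url);
        if (response.status !== 429) {
            return response; // not rate limited - hand the response back
        }
        // Retry-After can be a number of seconds or an HTTP date;
        // this sketch only handles the numeric (seconds) form
        const retryAfter = response.headers.get("Retry-After");
        const seconds = Number(retryAfter);
        const waitMs = retryAfter !== null && !Number.isNaN(seconds)
            ? seconds * 1000
            : backoffMs;
        console.log(`got 429, waiting ${waitMs} ms before attempt ${attempt + 1}`);
        await sleep(waitMs);
        backoffMs *= 2; // exponential backoff when no usable Retry-After
    }
    throw new Error(`still rate limited after ${maxAttempts} attempts`);
}
```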
<p>You can quickly do a search of <code>retry-after header</code> and you will get details, for example - <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Retry-After">https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Retry-After</a> and other search results</p>
<p>I generally trust the Mozilla Developer Network (MDN) documentation and website, and also trust AI answers (which usually give links to their sources) and Stack Overflow answers</p>
<p>I also noticed how there’s a <code>Report-To</code> header, which basically tells the client, in this case, our browser, to report this incident somewhere. Where? That’s mentioned in the <code>Report-To</code> header. I’m guessing that the browser does this automatically and that this implementation is done in the browser’s code, or else, the front-end code (JavaScript, TypeScript etc) has to do this I guess? Not sure</p>
<p>Looking at <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Security-Policy/report-to">https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Security-Policy/report-to</a> , looks like it’s not implemented in all browsers as of this writing. You can also find some information at <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Report-To">https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Report-To</a></p>
<p><code>Report-To</code> is actually tied to the <code>report-to</code> directive of CSP. CSP is short for Content Security Policy. See <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Security-Policy">https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Security-Policy</a> where you can find directives, reporting directives etc.</p>
<p>On reading a bit more, especially at <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Report-To">https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Report-To</a> , you will notice that <code>Report-To</code> has been deprecated and not recommended and has been replaced by <code>Reporting-Endpoints</code> - <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Reporting-Endpoints">https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Reporting-Endpoints</a></p>
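<p>For concreteness, here’s roughly what the deprecated and newer header styles look like (the reporting endpoint URL here is made up):</p>

```plaintext
# Deprecated: Report-To declares a named group of endpoints as JSON
Report-To: {"group": "default", "max_age": 86400, "endpoints": [{"url": "https://example.com/reports"}]}

# Replacement: Reporting-Endpoints maps a name directly to a URL
Reporting-Endpoints: default="https://example.com/reports"

# A CSP can then send its violation reports to that named endpoint
Content-Security-Policy: default-src 'self'; report-to default
```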
<p>You can also read more about Content Security Policy and related things at</p>
<ul>
<li><p><a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/API/Reporting_API">https://developer.mozilla.org/en-US/docs/Web/API/Reporting_API</a></p>
</li>
<li><p><a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CSP">https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CSP</a></p>
<ul>
<li><a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CSP#violation_reporting">https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CSP#violation_reporting</a></li>
</ul>
</li>
<li><p><a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Security-Policy">https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Security-Policy</a></p>
<ul>
<li><p><a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Security-Policy#directives">https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Security-Policy#directives</a></p>
</li>
<li><p><a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Security-Policy#reporting_directives">https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Security-Policy#reporting_directives</a></p>
</li>
</ul>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Infrastructure Cost Optimization Project. Part 1]]></title><description><![CDATA[This is based on an experience working on a project to reduce infrastructure cost.
This is just Part 1, since I have just started on this project. Still very very new to me. So, recently, I spoke to the folks working at https://svaksha.in and they ha...]]></description><link>https://karuppiah.dev/infrastructure-cost-optimization-project-part-1</link><guid isPermaLink="true">https://karuppiah.dev/infrastructure-cost-optimization-project-part-1</guid><category><![CDATA[infrastructure]]></category><category><![CDATA[cost-optimisation]]></category><category><![CDATA[AWS]]></category><category><![CDATA[optimization]]></category><category><![CDATA[cost]]></category><category><![CDATA[CostSavings]]></category><category><![CDATA[Cost efficiency]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Wed, 09 Oct 2024 10:14:34 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/BRl69uNXr7g/upload/cf7f73dbbc2af3614d790f4282d520bc.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This is based on an experience working on a project to reduce infrastructure cost.</p>
<p>This is just Part 1, since I have just started on this project. It’s still very, very new to me. So, recently, I spoke to the folks working at <a target="_blank" href="https://svaksha.in">https://svaksha.in</a> and they had a project that needed help with reducing infrastructure cost, apart from another requirement around running the project in offline mode or locally, instead of in a cloud</p>
<p>The system is a healthcare system. And it has some web APIs, and a database and some cron jobs. I’m still trying to understand more. They use AWS to host the whole thing. They use AWS EC2 instances to host the system. They also use AWS SQS, AWS S3 for queues and object storage (images) respectively</p>
<p>Below is a draft proposal I had for them, which seemed generic enough to share, since it had mostly only technical details, nothing client specific</p>
<p>Any client specific information or any other information like names, have been anonymised for privacy, unless it’s my own name - Karuppiah Natarajan or KP.</p>
<hr />
<p>What this document intends to answer -  </p>
<p>What are we proposing? As a strategy and solution for the high infrastructure cost (bill) problem</p>
<p>Cost to solve the problem? That is, the R&amp;D cost - research and development to implement the solution - and the final cost of the infrastructure after the solution is implemented.</p>
<p>How much time will it take to implement? Human resources - KP’s time, Alex’s time and/or Kyle’s time.</p>
<p>How many people will be implementing this solution? Most probably just one - either Alex or Kyle. KP is just architecting or solutionizing for now</p>
<hr />
<p>Current Scenario and Context:</p>
<p>Currently, we have one AWS EC2 instance for the whole system, that is, one server, which has the following things running  </p>
<ul>
<li><p>Database, with around 10GB of data. This is a PostgreSQL Database</p>
</li>
<li><p>API Service. This is always up and running. This is implemented in Python using the FastAPI web framework</p>
</li>
<li><p>Cron jobs. The cron job runs once every minute. The duration of a cron job is around a minute</p>
<ul>
<li><p>Questions</p>
<ul>
<li><p>Is there only one cron job here? Or multiple cron jobs? What is the cron schedule for each of the cron jobs if there are multiple. And are all the cron jobs doing the same / similar thing? That is, processing the image from S3?</p>
</li>
<li><p>The cron job kills itself after running for a minute? Even if it hasn’t finished / completed its processing?</p>
</li>
<li><p>What happens when a cron job fails? Fails gracefully and also if it fails non-gracefully</p>
</li>
<li><p>How is the cron job implemented? Is it some Python code? Python script? Does it use the API service? Does it use the Database?</p>
</li>
</ul>
</li>
<li><p>Need more detail here 👆</p>
</li>
</ul>
</li>
</ul>
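<p>On the overlap question above - whether a run that exceeds a minute collides with the next scheduled run - a common guard is a non-blocking file lock taken at the start of the job. A minimal Python sketch of that pattern; the lock file path and the job body here are hypothetical, not the actual implementation:</p>

```python
import fcntl
import sys

# Hypothetical lock file path; any writable path works.
LOCK_PATH = "/tmp/image-processing-cron.lock"

def run_job():
    # Placeholder for the actual cron job work
    # (e.g. processing an image from S3).
    print("processing...")

def main():
    lock_file = open(LOCK_PATH, "w")
    try:
        # Non-blocking exclusive lock: fails immediately if the
        # previous run of this job is still holding the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        print("previous run still in progress, skipping this run")
        sys.exit(0)
    try:
        run_job()
    finally:
        fcntl.flock(lock_file, fcntl.LOCK_UN)
        lock_file.close()

if __name__ == "__main__":
    main()
```

<p>An advantage of <code>flock</code> over a hand-rolled PID file is that the lock is released automatically if the process crashes, so a failed run never blocks future runs.</p>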
<p>We also have one AWS EC2 instance for the development server or what we call a dev server. This is used for development purposes - mainly testing, before deploying</p>
<p>So, overall, if we look at the current AWS EC2 instance usage and pricing -</p>
<p>We currently use one big EC2 instance in AWS for running the whole system, of type <code>c5.xlarge</code>. The complete details are spread across different places on AWS, but an easier way to find them all in one place is this third-party page - <a target="_blank" href="https://instances.vantage.sh/aws/ec2/c5.xlarge">https://instances.vantage.sh/aws/ec2/c5.xlarge</a></p>
<p>We also use one medium-sized EC2 instance for running the development server, of type <code>t3.medium</code>, the complete details of which can be found here - <a target="_blank" href="https://instances.vantage.sh/aws/ec2/t3.medium">https://instances.vantage.sh/aws/ec2/t3.medium</a></p>
<hr />
<p>Goals of the Solution:</p>
<ul>
<li><p>Bring minimal changes to the system - both the software and the infrastructure. Preferably no changes to the software at all, except for configuration specific to the infrastructure changes</p>
</li>
<li><p>Keep it simple. No complications, no over-engineering, and definitely no over-complication.</p>
</li>
<li><p>Does not demand too much effort, time or energy from the maintainers of the software and infrastructure to make the new changes, if any, needed to implement the solution</p>
</li>
<li><p>Does not involve a steep learning curve either - especially not having to learn something entirely new - for the maintainers of the software and infrastructure to make the new changes, if any, needed to implement the solution</p>
</li>
</ul>
<p>Proposed Solution</p>
<ul>
<li><p>Use a separate VM (Virtual Machine) for running different things</p>
<ul>
<li><p>Separation of Concerns</p>
<ul>
<li><p>Since we are not running all the services inside containers (Linux containers) as isolated processes, it’s very possible for one or more processes to hog the resources of the virtual machine, affecting other - possibly very critical - processes and hence the whole system. This is popularly called the noisy neighbor problem</p>
</li>
<li><p>For example, if for some reason the cron jobs never complete and take up too many resources (CPU and RAM), they can hinder important processes like the API service and the database. Another example: if any process that uses the disk - say the cron job - fills it up completely, that causes problems for every other process, since at least some free disk space is needed to run the system, including the Linux OS and kernel related processes. It especially causes problems for processes like the database, which needs the disk to store and read data</p>
</li>
</ul>
</li>
</ul>
</li>
<li><p>Use small or very small VMs (Virtual Machines) and scale horizontally whenever possible</p>
<ul>
<li><p>Why? Rationale? A few reasons</p>
<ul>
<li><p>Public clouds provide virtual and physical machines only in specific standard sizes. Even chip manufacturers provide only certain standard sizes for their parts - 2, 4 or 8 cores; 2, 4 or 8 GB of RAM. In-between sizes like a 5-core CPU or a 5 GB RAM module are not easy to find, nor popular. So when we use small or very small VMs, we can actually use them efficiently. If we need just 5 GB of RAM, instead of having to get a 6 GB or 8 GB machine from the cloud, we can use a mix of machines with less RAM - assuming we can scale horizontally - say one 4 GB VM plus one 1 GB VM, or five 1 GB VMs, or any such mix. We can mix and match; as long as our software runs on the VMs and can scale horizontally, all is great. For example, API services can generally scale horizontally since they are stateless. The same is true for workloads like cron jobs, though we have to ensure that if a cron job runs on one machine, the exact same job doesn’t run again on another machine when that’s not needed.</p>
</li>
<li><p>We can scale with ease and use just the resources we need, so there won’t be under-utilization. There are optimal resource utilization targets, like 60% or 70%, and we can maintain them more easily with the help of more, but smaller, VMs</p>
</li>
</ul>
</li>
<li><p>Problems?</p>
<ul>
<li>Managing more and more VMs can be problematic. As more and more VMs come up into the system, we need some way to manage the VMs. Usually people use orchestrators here, like <a target="_blank" href="https://kubernetes.io/">Kubernetes</a> , <a target="_blank" href="https://www.nomadproject.io/">Hashicorp Nomad</a> (either Open Source or Enterprise) and similar.</li>
</ul>
</li>
</ul>
</li>
</ul>
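<p>The “mix and match” idea above can be sketched as a tiny greedy calculation: given a RAM requirement and the standard VM sizes on offer, pick a combination that covers the need without a big leftover. The sizes and the greedy strategy are purely illustrative - this is not a real capacity planner:</p>

```python
def pick_vms(ram_needed_gb, sizes=(4, 2, 1)):
    """Greedily cover a RAM requirement with standard VM sizes.

    sizes: available VM RAM sizes in GB, largest first (illustrative).
    Returns a list of chosen VM sizes whose total covers the need.
    """
    chosen = []
    remaining = ram_needed_gb
    for size in sizes:
        while remaining >= size:
            chosen.append(size)
            remaining -= size
    if remaining > 0:
        chosen.append(sizes[-1])  # smallest VM covers any remainder
    return chosen

# Need 5 GB: one 4 GB VM plus one 1 GB VM,
# instead of paying for an 8 GB machine.
print(pick_vms(5))  # [4, 1]
```

<p>The same shape of calculation works for vCPU counts; the real decision also has to factor in per-instance overhead (OS, agents) and the orchestration cost mentioned above.</p>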
<hr />
<p>Cost of the Final Infrastructure</p>
<p>Some estimates</p>
<ul>
<li><p>We spend $32.704 a month for the dev server. Check calculation <a target="_blank" href="https://instances.vantage.sh/aws/ec2/t3.medium?region=ap-south-1&amp;os=linux&amp;cost_duration=monthly&amp;reserved_term=Standard.noUpfront">here</a></p>
</li>
<li><p>We spend $124.10 a month for the main server. Check calculation <a target="_blank" href="https://instances.vantage.sh/aws/ec2/c5.xlarge?region=ap-south-1&amp;os=linux&amp;cost_duration=monthly&amp;reserved_term=Standard.noUpfront">here</a></p>
</li>
</ul>
<p>The proposal is to use at least 1 VM for each of the following</p>
<ul>
<li><p>Running API service</p>
</li>
<li><p>Running Database</p>
</li>
<li><p>Running Cron Jobs</p>
</li>
</ul>
<p>We have not included the “main site” here. We would have to consider that too if it’s needed</p>
<p>Each VM’s size can be based on the workload it runs. Given we use PostgreSQL for our database, I think it can run with fewer resources too. A good minimum is 1 GB of RAM and 2 CPU cores - in the cloud world these are called vCPUs, that is, virtual CPUs, analogous to virtual machines (VMs).</p>
<p>We can try running all our workloads with just 1 GB of RAM and 2 CPU cores each. For PostgreSQL, if we are concerned, we can run it with 2 GB of RAM and 4 CPU cores. But currently there is zero traffic, so 1 GB of RAM and 2 CPU cores should be enough for now. Even with some traffic, I think PostgreSQL can handle it; if not, we scale up to 2 GB of RAM and 4 CPU cores, and if even that is not enough, we look at the traffic and the reasoning before scaling further. That is the ideal thing to do - to understand why resources are not enough: is there an actual need, or is there a problem in the system, like a bug or some other issue causing extra resource usage? For example, auto vacuum in PostgreSQL could be running again and again and failing for some reason, and that could be the cause of the resource usage - this is just an example though.</p>
<p>So, with that in mind, we can say, we will need 3 servers</p>
<ul>
<li>All 3 will be <a target="_blank" href="https://instances.vantage.sh/aws/ec2/t3.micro?region=ap-south-1&amp;os=linux&amp;cost_duration=monthly&amp;reserved_term=Standard.noUpfront">t3.micro</a> instances. So, the cost will be around $8.176 x 3 = $24.528</li>
</ul>
<p>Worst case scenario -</p>
<ul>
<li>All 3 will be <a target="_blank" href="https://instances.vantage.sh/aws/ec2/t3.small?region=ap-south-1&amp;os=linux&amp;cost_duration=monthly&amp;reserved_term=Standard.noUpfront">t3.small</a> instances. So, the cost will be around $16.352 x 3 = $49.056</li>
</ul>
<p>That’s almost double the cost. t3.small has 2GB RAM and 2 CPU cores. If that’s also not enough, we need to look at other options</p>
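<p>As a sanity check on the estimates above (the per-instance monthly prices are the ones quoted from the Vantage pages; nothing else is assumed):</p>

```python
# Monthly on-demand prices quoted above (ap-south-1, USD)
t3_micro = 8.176   # 2 vCPUs, 1 GB RAM
t3_small = 16.352  # 2 vCPUs, 2 GB RAM

best_case = 3 * t3_micro   # three t3.micro instances
worst_case = 3 * t3_small  # three t3.small instances

print(f"best case:  ${best_case:.3f}/month")   # best case:  $24.528/month
print(f"worst case: ${worst_case:.3f}/month")  # worst case: $49.056/month
print(f"ratio: {worst_case / best_case:.1f}x")  # ratio: 2.0x
```

<p>Either way, both cases are well below the current $124.10/month for the single <code>c5.xlarge</code> main server.</p>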
<p>Problems that I foresee</p>
<ul>
<li><p>If any of the workloads use up too many resources even when traffic is normal, we need to scale up and then understand why the system needs so many resources and whether that makes sense. If the workloads use up too many resources when traffic is high, we again need to understand whether that makes sense and why.</p>
</li>
<li><p>I feel like the web service written in Python might need more resources even for a single instance. Only once we run it with fewer resources will we be able to tell</p>
</li>
</ul>
<p>Assumptions</p>
<ul>
<li><p>I don’t think PostgreSQL will have a problem. It will run fine, given it’s battle-tested software. As long as the configuration is right and it’s given resources (CPU, RAM, disk) good enough for the workload it’s going to run, it will run fine. This assumes the SQL queries are written efficiently; otherwise we need to make them efficient and also use PostgreSQL’s basic features, like indexes (or indices), to make queries run fast for things like search</p>
</li>
<li><p>I don’t think the Linux OS and the Linux kernel will require many resources to run, so it should be fine to use 1 GB RAM and 2 vCPU cores</p>
</li>
</ul>
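<p>On the indexing point above: the project’s database is PostgreSQL, but the principle is easy to demonstrate with Python’s built-in <code>sqlite3</code> module, which keeps this sketch self-contained (the <code>patients</code> table is hypothetical). The same <code>CREATE INDEX</code> idea applies in PostgreSQL, where <code>EXPLAIN</code> plays the role of <code>EXPLAIN QUERY PLAN</code>:</p>

```python
import sqlite3

# sqlite3 is used here only so the sketch runs anywhere;
# the project itself uses PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO patients (name) VALUES (?)",
                 [(f"patient-{i}",) for i in range(1000)])

# Without an index, the query plan scans the whole table.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM patients WHERE name = ?",
    ("patient-500",)).fetchone()
print(plan)  # detail column mentions a table SCAN

conn.execute("CREATE INDEX idx_patients_name ON patients (name)")

# With the index, the same query becomes an index lookup.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM patients WHERE name = ?",
    ("patient-500",)).fetchone()
print(plan)  # detail column now mentions idx_patients_name
```

<p>An index trades a little extra disk and write cost for much faster lookups, which matters far more than raw instance size once data grows.</p>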
<p>Current set of problems or issues that are hindering us from finding the right / perfect / ideal / good enough solution and being able to compare solutions:</p>
<ul>
<li><p>We don’t know exactly how much resource each process takes up - in terms of CPU, RAM (Memory), Network and Disk</p>
<ul>
<li>The current AWS EC2 instance web console shows monitoring, but only at the EC2 instance level. The problem is that one EC2 instance runs all the processes - both ours and the system’s (Linux OS, Linux kernel) - so we cannot tell which process is using how much. We only see high-level data at the instance level, and even then, for some reason, only CPU usage as a percentage - no RAM. There are some network usage graphs too</li>
</ul>
</li>
</ul>
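<p>Until proper monitoring exists, even parsing <code>ps</code> output on the instance gives a first per-process answer. A deliberately naive Python sketch - it assumes the Linux column layout of <code>ps -eo pid,pcpu,pmem,comm</code>, and the process names in the sample are made up:</p>

```python
def top_processes(ps_output, n=5):
    """Return the n processes using the most CPU, from
    `ps -eo pid,pcpu,pmem,comm` style output (header + rows)."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip header row
        pid, pcpu, pmem, comm = line.split(None, 3)
        rows.append((comm, float(pcpu), float(pmem)))
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]

# On the instance itself one would feed it live output:
#   import subprocess
#   out = subprocess.run(["ps", "-eo", "pid,pcpu,pmem,comm"],
#                        capture_output=True, text=True).stdout
#   for name, cpu, mem in top_processes(out):
#       print(f"{name}: {cpu}% CPU, {mem}% MEM")

sample = """  PID %CPU %MEM COMMAND
    1  0.1  0.5 systemd
  201 45.0 12.3 postgres
  305 30.2  8.1 uvicorn
  410  5.5  1.0 cron"""
print(top_processes(sample, n=2))
# [('postgres', 45.0, 12.3), ('uvicorn', 30.2, 8.1)]
```

<p>This answers only “which process, right now” - trends over time still need a real monitoring setup, as discussed below.</p>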
<p>These problems will persist in the future too, if we don’t solve them. How do we solve them? By setting up monitoring and observability. That is a bit of an overkill for a small scale setup like this, but it’s key to being able to understand what’s going on - at least something basic. Maybe Svaksha can implement and run a central monitoring and observability setup for all their clients and provide it at a low cost.</p>
<hr />
<p>Goals for an Ideal Future</p>
<ul>
<li><p>Very reduced infrastructure cost</p>
</li>
<li><p>Performant - in terms of time, speed, resources required to run</p>
<ul>
<li><p>Requires less compute - less CPU and RAM too</p>
</li>
<li><p>Requires less storage - less RAM, less or no disk</p>
</li>
<li><p>Requires less or no network</p>
</li>
</ul>
</li>
<li><p>Easy to modify system - both software and infrastructure</p>
</li>
<li><p>Easy to completely destroy the whole system and bring it back up again whenever needed. Especially the infrastructure and the infrastructure related setup and any software too</p>
</li>
<li><p>Easy to test the system</p>
</li>
<li><p>Easy to develop the system</p>
</li>
<li><p>Ability to Reuse the setup elsewhere and ease of reuse too</p>
<ul>
<li><p>Local / Running Offline</p>
</li>
<li><p>Other cloud environments where cloud costs are cheaper</p>
</li>
</ul>
</li>
</ul>
<p>Futuristic ideas for an ideal future, based on the above goals:</p>
<ul>
<li><p>Different Levels of Improvement</p>
<ul>
<li><p>Software</p>
<ul>
<li><p>Use efficient algorithms</p>
</li>
<li><p>Use efficient software - libraries, frameworks, tools, and systems</p>
<ul>
<li>Libraries, frameworks, tools and systems that implement the things you need, say algorithms and any processing, in an efficient manner, cost effective manner, with less or least resources</li>
</ul>
</li>
<li><p>Use efficient programming languages - with efficient run time - for better performance with less resource usage. Basically, better performance to price ratio</p>
</li>
<li><p>Understand the software’s cost (if any, in terms of money, effort etc) and performance using benchmarks and then use it</p>
</li>
</ul>
</li>
<li><p>Hardware</p>
<ul>
<li><p>Use efficient hardware overall</p>
<ul>
<li><p>RAM</p>
<ul>
<li><p>Research and use RAM that has better performance. There are many RAM manufacturers out there, and different types of RAM too, I think</p>
</li>
<li><p>Understand the RAM chip’s cost and performance using benchmarks and then use it</p>
</li>
</ul>
</li>
<li><p>Disk</p>
<ul>
<li><p>Use Disks with better input/output performance - read/write performance - basically, IOPS - Input Output Operations Per Second</p>
</li>
<li><p>Use newer-technology disks whenever possible, especially for disk-heavy software like databases. So, consider using Solid State Drives (SSDs) instead of Hard Disk Drives (HDDs). There are different variants, versions and types of these in the market and in the cloud, so choose appropriately, based on cost and need. SSDs are significantly costlier than HDDs</p>
</li>
<li><p>Understand the disk’s cost and performance using benchmarks and then use it</p>
</li>
</ul>
</li>
<li><p>CPU</p>
<ul>
<li><p>Try using newer architectures and chips</p>
<ul>
<li>For example, try using arm64, which is the ARM architecture, 64-bit. Apparently it has better performance with less resource usage - basically, a better performance-to-price ratio. Try the same with chips from AMD in general if you are using the popular Intel chips: for the same amd64 architecture, one can get chips from either AMD or Intel, even though the architecture is named amd64 - amd for the naming and 64 for 64-bit</li>
</ul>
</li>
<li><p>Understand the CPU’s (CPU chip’s) cost and performance using benchmarks and then use it</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li><p>Network</p>
<ul>
<li><p>Use efficient topologies</p>
<ul>
<li><p>Less complexity preferably, for better understanding</p>
</li>
<li><p>Fewer hops as much as possible, to avoid delays and latency</p>
</li>
</ul>
</li>
<li><p>Use efficient network protocols</p>
</li>
<li><p>Use network efficiently - requiring less bandwidth whenever possible as network calls take time depending on the speed of the network (bandwidth) and also cost a lot in the cloud in some specific cases</p>
</li>
<li><p>Use network cost effectively - as network calls / network usage cost a lot in the cloud in some specific cases. And even if it’s local machine, it can get costly depending on the network being used and the network provider</p>
</li>
<li><p>Understand the network cost and performance using benchmarks and then use it</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>
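<p>Tying together the recurring “benchmark it first” advice above: even a crude, self-written check can distinguish wildly different disks before committing to one. A rough sequential write throughput sketch in Python - the sizes are arbitrary choices, and a dedicated benchmarking tool would be the serious option:</p>

```python
import os
import tempfile
import time

def write_throughput_mb_s(total_mb=8, block_kb=64):
    """Sequentially write total_mb of data, fsync, and report MB/s.
    A crude sanity check only - numbers vary with OS caching and
    hardware, so compare machines, don't treat this as IOPS."""
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # force the data to actually hit disk
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return total_mb / elapsed

print(f"~{write_throughput_mb_s():.0f} MB/s sequential write")
```

<p>Running the same sketch on an HDD-backed and an SSD-backed volume usually shows the gap immediately; for publishable numbers, a purpose-built tool is the right choice.</p>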
]]></content:encoded></item><item><title><![CDATA[Interesting ideas to use AI. Part 1]]></title><description><![CDATA[I see interesting ideas to use AI from Gemini AI
Some examples with appropriate prompts with changeable / editable parameters:

Brainstorm presentation ideas about a topic

Generate 10 interesting ideas for a presentation about biology. Provide your ...]]></description><link>https://karuppiah.dev/interesting-ideas-to-use-ai-part-1</link><guid isPermaLink="true">https://karuppiah.dev/interesting-ideas-to-use-ai-part-1</guid><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[AI]]></category><category><![CDATA[Google Gemini AI]]></category><dc:creator><![CDATA[Karuppiah Natarajan]]></dc:creator><pubDate>Wed, 09 Oct 2024 07:43:09 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1728459756778/1a4f49bc-9e56-48e4-b06f-e0fedfbfb416.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I see interesting ideas to use AI from Gemini AI</p>
<p>Some examples with appropriate prompts with changeable / editable parameters:</p>
<ul>
<li><p>Brainstorm presentation ideas about a topic</p>
<ul>
<li>Generate <code>10</code> interesting <code>ideas</code> for a presentation about <code>biology</code>. Provide your response in the form of a <code>list</code>.</li>
</ul>
</li>
<li><p>Find hotels in Phuket for a week in March and suggest a packing list</p>
</li>
<li><p>I'm writing a short story set in the Elizabethan era. @OpenStax provide some context on daily life and customs</p>
</li>
<li><p>I'm a huge soccer fan. Test my knowledge with a quiz, focusing on European teams</p>
</li>
<li><p>Explain the impact of globalization</p>
<ul>
<li>@OpenStax explain the impact of globalization on developing countries</li>
</ul>
</li>
<li><p>Give me ideas for what to do with what's in this image?</p>
<ul>
<li>What is <code>noteworthy</code> about this image? Give me ideas for how I can use its elements <code>creatively</code></li>
</ul>
</li>
<li><p>Help me train for a race next month</p>
<ul>
<li>Help me plan for a 5K run, I have 1 month to train</li>
</ul>
</li>
<li><p>Write lyrics to a song about heartbreak</p>
<ul>
<li>Write some lyrics for a heartbreak song titled ‘Lovesick'</li>
</ul>
</li>
</ul>
]]></content:encoded></item></channel></rss>