TeamCity On Demand Agents and Patching Windows
TeamCity On Demand / Cloud agents enable you to reduce build costs by shutting down instances when not in use, as well as freeing up costly TeamCity licenses for other agents.
After implementing TeamCity On Demand agents we have seen a significant reduction in build agent hosting costs, as agents are now started only when needed and automatically shut down once the build has completed.
Additionally, since our teams are around the world, we can use a lower number of TeamCity licenses than we have agents available for teams. Licenses are automatically assigned to agents only when they are running.
This allows us to spin down the agents of teams which won't be actively building at the time, with the ability to automatically spin up agents to accommodate the late-night coder on the other side of the world.
However, one issue that must be kept in mind with On Demand agents is scheduled patching windows.
We patch on a weekly schedule, and since the On Demand agents were often stopped during these patch windows, I would come in the next morning to a notification listing hundreds of servers that I had to patch in a follow-up run.
To resolve this, I created a script which automatically starts all TeamCity On Demand agents and keeps them running through the patching window, and then stops them after the patches have completed.
While I unfortunately cannot share this script, I can share some caveats I found along the way. These should help if you plan on writing a similar script of your own.
last updated 2019-06-05T22:58:33-0700
If you found this post you probably already know this by now - you can't simply "start" all of the instances, either in TeamCity or via your cloud provider's CLI. If you have an idle time set in TeamCity, it will stop the server after this idle time regardless of whether it was "manually" started or not.
The TeamCity API endpoints responsible for these actions are mainly PUT endpoints. This means that there isn't a simple "start all instances for X time" endpoint.
You have to PUT the full Cloud Profile settings for each individual profile (including cloud provider credentials) - this means you must first GET the provider details and store them before starting, and then PUT them back once complete.
For security you will not receive the cloud provider credentials when requesting the data, but you are required to provide them when you PUT back the configuration once complete.
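The save-and-restore flow described above could be sketched roughly as follows. This is a minimal illustration and not the script mentioned earlier: it works on plain dictionaries and makes no HTTP calls. The property names (`terminate-idle-time`, `secure:access-key`) are hypothetical placeholders, not confirmed TeamCity property names, and raising the idle timeout is only one assumed way of keeping agents up during the window. The point is just how the credentials missing from the GET response get merged back in before each PUT.

```python
from typing import Dict, Optional


def build_put_properties(
    fetched: Dict[str, str],
    credentials: Dict[str, str],
    overrides: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Combine properties saved from a GET with the credentials the server
    withheld, plus any temporary overrides, into a payload to PUT back."""
    empty = [key for key, value in credentials.items() if not value]
    if empty:
        raise ValueError(f"empty credential values: {empty}")
    merged = dict(fetched)          # copy, so the saved original stays untouched
    merged.update(credentials)      # re-add secrets omitted from the GET response
    merged.update(overrides or {})  # e.g. a longer idle timeout during patching
    return merged


# Hypothetical property names and values, for illustration only.
saved = {"name": "build-agents", "terminate-idle-time": "30"}
secrets = {"secure:access-key": "example-secret"}

# Before patching: same profile, idle timeout raised for the window.
during_patching = build_put_properties(saved, secrets, {"terminate-idle-time": "600"})
# After patching: restore the saved values, again with credentials attached.
restored = build_put_properties(saved, secrets)
```

Keeping the saved copy unmodified means the restore step sends back exactly what was fetched, with only the credentials added.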
Unlike conventional agents, On Demand agents are associated directly with a specific project in a "Project Pool". Furthermore, while agents can be assigned to sub-projects, from an API perspective, your modifications will all be on the "Parent Project".
If you have significantly fewer licenses than agents, you may not be able to start them all at once. To mitigate this you will either need to disable the agents temporarily, or do rolling patches over a set period of time.
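For the rolling option, the batching itself is simple to sketch. The example below assumes you already have a list of agent names and know how many licenses are free. The function name `plan_batches` and the agent names are made up for illustration. Starting, patching and stopping each batch is left out.

```python
from typing import List, Sequence


def plan_batches(agents: Sequence[str], free_licenses: int) -> List[List[str]]:
    """Split agents into consecutive batches no larger than the number of
    licenses available, so each batch can run at the same time."""
    if free_licenses < 1:
        raise ValueError("need at least one free license")
    return [
        list(agents[start:start + free_licenses])
        for start in range(0, len(agents), free_licenses)
    ]


# Hypothetical fleet: seven agents, but only three licenses free at once.
agents = [f"agent-{n}" for n in range(1, 8)]
batches = plan_batches(agents, free_licenses=3)

for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {', '.join(batch)}")
```

Each batch would be started, patched and stopped before the next one begins, so the number of running agents never exceeds the license count.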