Guild AI: Better Models by Measuring

Oct 14, 2020 · 5 min read

Experiments in Guild AI, screenshot by author

Guild AI is a lightweight, open source tool used to run, capture, and compare machine learning experiments.

To create an experiment, run your script with Guild AI from a command prompt:

$ guild run train.py

Guild captures a full record of your experiment.

Source code
Hyperparameters (learning rate, batch size, etc.)
Script output
Script results (accuracy, AUC, loss, etc.)
Generated files (saved models, data sets, plots, etc.)
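
As a rough illustration, here is the kind of script the command above could run. This train.py is hypothetical: the variable names, the toy loss formula, and the printed "loss" line are made up for the example. It assumes Guild's Python support treats module-level variables as hyperparameters and reads "name: value" lines in script output as results; check the Guild documentation for the exact rules.

```python
# train.py: a hypothetical script, not taken from the Guild AI docs.

# Module-level variables with illustrative default values.
# Assumption: Guild can detect and set these as hyperparameters.
learning_rate = 0.1
epochs = 5


def train(learning_rate, epochs):
    """Stand-in for a real training loop; returns a made-up loss."""
    loss = 1.0
    for _ in range(epochs):
        # Pretend each epoch shrinks the loss by a fixed fraction.
        loss *= 1.0 - learning_rate
    return loss


if __name__ == "__main__":
    final_loss = train(learning_rate, epochs)
    # Plain "name: value" line on standard output.
    print(f"loss: {final_loss:.4f}")
```

Note that the script imports nothing from an experiment tracking library. It runs the same way with plain python.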

Guild lets you compare results with a variety of tools including TensorBoard, HiPlot, file diff, and Guild View. Explore and compare experiments to answer questions about what ran and how it performed.

Compare runs using parallel coordinates view, screenshot by author

Why track experiments?

If you don’t record results, you depend on memory and intuition. This gets you only so far. Without systematic measurement, it’s hard to answer important questions.

Which hyperparameters yield the best results for a given approach?
Did my latest change break something unexpectedly?
What caused that sudden improvement in performance last week?
How do our AUC-ROC curves compare over time?
Did my colleague get the same result? What was different? What can we learn from those differences?

The list goes on.

If you don’t measure, you miss opportunities to learn from your work and make better decisions.

But experiment tracking is a pain

Experiment tracking can be a pain, depending on your approach.

Spreadsheets. Copy-and-paste is tedious and error prone. You miss crucial details like source code, script output, and files generated by your script.
Roll-your-own experiment tracking. It seems simple: just copy results to a timestamped directory. This misconception leads many to develop their own experiment tracking systems. Do you want to spend time building tools, or using them?
Paid services. If you have to create an account, provide credit card information, or otherwise submit data to a corporation — just to capture an experiment — you’re less likely to capture an experiment.
Modifications to code. Experiment tracking tools often mandate invasive changes to code. Even simple operations like writing a file need specialized libraries. Each code change is a distraction from your work. Worse, it ties your scripts to non-standard frameworks.
Required databases, agents, file systems, etc. Many experiment tracking tools use external systems that you install, configure and maintain. Even if you have the expertise, this work is a distraction from the goal of building better models.

Despite your best intentions to track experiments, you may conclude that it’s not worth the pain.

Experiment tracking is NOT a pain

Guild AI is different.

Guild does not mandate changes to your code — your code runs as-is without ties to a framework.
Guild does not use databases, exotic file systems, or back-end services.
Guild never asks you to create an account.
Guild is 100% open source and platform independent. It comes free of charge and without strings attached.

If experiment tracking is fast and easy — as it is with Guild — you’re more likely to do it.

Guild AI design philosophy

Guild AI adheres to the Unix philosophy. It’s designed to be simple and lightweight without compromising features.

Guild saves each run in a unique directory. There are no databases or exotic file systems. External systems are a pain to set up and maintain. You don’t need them, so Guild doesn’t use them.
Guild captures inputs and outputs through standard process interfaces. It sets hyperparameters using command line arguments, environment variables, and specialized support for Python modules. Guild gets output from standard IO and industry standard summary files. This keeps your code 100% independent of the experiment tracking framework.
Guild saves everything you need with each run. This includes source code, system properties, software libraries, and a host of metadata. It’s hard to appreciate the breadth of saved information until you need it. Luckily Guild handles this for you.
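
To make the idea concrete, here is a minimal sketch of a runner built only on standard process interfaces. It is not Guild's implementation. The names run_script, parse_scalars, and output.txt, the --name=value flag style, and the output pattern are all assumptions made for illustration. The sketch passes hyperparameters as command line arguments, gives the run its own unique directory, and reads results from standard output.

```python
import re
import subprocess
import sys
import tempfile
import uuid
from pathlib import Path

# Illustrative pattern: matches lines such as "loss: 0.25" or "lr: 1e-3".
SCALAR_LINE = re.compile(r"^(\w+):\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\s*$")


def parse_scalars(output):
    """Collect "name: number" lines from captured script output."""
    scalars = {}
    for line in output.splitlines():
        match = SCALAR_LINE.match(line)
        if match:
            scalars[match.group(1)] = float(match.group(2))
    return scalars


def run_script(script, flags, runs_root):
    """Run a script with flags as --name=value arguments in a new run directory."""
    run_dir = Path(runs_root) / uuid.uuid4().hex  # unique directory per run
    run_dir.mkdir(parents=True)
    args = [sys.executable, str(Path(script).resolve())]
    args += [f"--{name}={value}" for name, value in flags.items()]
    result = subprocess.run(
        args, cwd=run_dir, capture_output=True, text=True, check=True
    )
    # Keep the raw output next to anything the script wrote in run_dir.
    (run_dir / "output.txt").write_text(result.stdout)
    return run_dir, parse_scalars(result.stdout)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # A throwaway script that knows nothing about the runner.
        script = Path(tmp) / "train.py"
        script.write_text(
            "import argparse\n"
            "p = argparse.ArgumentParser()\n"
            "p.add_argument('--lr', type=float, default=0.1)\n"
            "args = p.parse_args()\n"
            "print(f'loss: {1.0 - args.lr}')\n"
        )
        run_dir, scalars = run_script(script, {"lr": 0.25}, Path(tmp) / "runs")
        print(run_dir.name, scalars)
```

The script under test only parses arguments and prints. All of the tracking logic lives outside it, which is the point of this design.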

If you’re tempted to write your own experiment tracking code, try Guild first. You get the same straightforward approach but with a full-featured, proven tool.

Hidden benefits of experiment tracking

When you use Guild AI to run experiments, you get powerful extras.

Automated pipelines. Output from a run can be used as input to other runs to create automated pipelines.
Hyperparameter search. Automate grid search, random search, and Bayesian optimization.
Remote runs. Run on remote systems the same way you run locally.
Backup and restore. Copy runs to cloud services or on-prem backup servers.
Collaboration. Share results with colleagues and collect their feedback on each run.
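
As a point of reference for the hyperparameter search item above, grid search amounts to running one trial per combination of candidate values. The sketch below shows that expansion in plain Python. It illustrates the technique only and does not show how Guild implements or invokes its search. The function names expand_grid and grid_search are hypothetical, and the sketch assumes a lower score is better.

```python
from itertools import product


def expand_grid(flags):
    """Yield one flag dict per combination; list values are search dimensions."""
    names = list(flags)
    # Wrap single values in a list so every flag has a list of choices.
    choices = [v if isinstance(v, list) else [v] for v in flags.values()]
    for combo in product(*choices):
        yield dict(zip(names, combo))


def grid_search(objective, flags):
    """Evaluate every combination; return (best_flags, best_score), minimizing."""
    results = [(trial, objective(**trial)) for trial in expand_grid(flags)]
    return min(results, key=lambda item: item[1])


if __name__ == "__main__":
    # Toy objective standing in for a training run that returns a loss.
    def toy_loss(lr, batch_size):
        return abs(lr - 0.1) + batch_size / 1000

    best_flags, best_score = grid_search(
        toy_loss, {"lr": [0.01, 0.1, 1.0], "batch_size": [16, 32]}
    )
    print(best_flags, best_score)
```

Random search and Bayesian optimization replace the exhaustive loop with sampling or a model-guided choice of the next trial, but the run-and-record step stays the same.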

Open source and platform independent

Guild AI is 100% open source software. You can use Guild in your projects without limiting others.

Guild AI is platform independent. When you use Guild, you don’t connect to a back-end system. You don’t share your data. You’re free to run your code how and where you want without influence from a corporate interest.

These are important factors when considering the cost of experiment management. Independent, open source software is not only free in terms of money. It’s free in terms of autonomy — you make the best decisions for your project and your code.

Does Guild AI use magic?

Guild AI supports some magic to keep things simple for new users. This support is limited. To be explicit and control what Guild does, add a Guild file to your project.

Consider an example. Guild auto-detects project source code by looking for typical source code files. What if Guild gets this wrong? Use a Guild file to control what’s copied:

# guild.yml (located in project directory)
train:                   # operation definition
  sourcecode:            # source code config
    - exclude: data      # don't copy files from data dir

For a complete description of Guild’s configuration support, see the Guild File Reference.

Better models by measuring

No matter how formidable your intuition, data science is inherently experimental. You don’t know whether a decision is helpful or harmful until you test it and observe an outcome.

Guild AI lowers the barrier to formal measurement in data science. Run your script with Guild to capture a full record of the operation. With each measurement you collect more evidence to inform your next steps — and to build better models.

To learn more about Guild AI, visit https://guild.ai.