6.1. Builtin Essentials

All builtin rules defined in pyke are functions that schedule jobs to be run by the internal job scheduler at build time.

Before checking the function reference, you need to know that ALL builtin functions accept, by convention, some optional keyword arguments. These keyword arguments give you more control and flexibility when you write pyke scripts: they let you specify extra dependencies, declare extra files created by a command, and state under which circumstances a command should run.

6.1.1. Including builtins in your code

The recommended way to use the builtin functions in your build scripts is to import everything from pyke.builtins, usually at the top of the master script. Importing once in the master script is enough; the functions will then be available in all the other pyke build scripts.

Place the following line at the top of the master.pyke file and you are done!

from pyke.builtins import *

6.1.2. Working with paths

Apart from the rules, you are always going to be working with paths: specifying which files will be created, which ones will be removed, which ones are dependencies, and so on. This is why understanding the convention used for paths is essential. It is simple, but essential nonetheless.

All paths passed to builtin functions should be absolute paths or paths relative to the pyke build file.

Let me repeat myself about relative paths. When this document talks about relative paths, it means paths relative to pyke’s build file. Even if you include a Python module you have created, and a function in that module calls a rule with a relative path, the path will be relative to the pyke build file being executed, not to your module.

Since every pyke build has one, and only one, master file, you can also reference paths relative to the master.pyke file by using the masterpath() function.

Here is a simple example to clarify these concepts; you’ll see that it is really easy.

Imagine the following folder structure:

+ /home/user/project/
  + component/
    - build.pyke
  - master.pyke

Let’s say this is the master.pyke:

include ('component/build.pyke')

# absolute location
mkdir ('/home/user/project/output/')

# relative to this build file
mkdir ('output/')

# relative to master.pyke
mkdir (masterpath ('output/'))

And this is build.pyke:

# absolute location
mkdir ('/home/user/project/output/')

# relative to build.pyke
mkdir ('../output/')

# relative to master.pyke
mkdir (masterpath ('output/'))

You can easily see that we are always referencing the same location in both examples, using absolute and relative paths.
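
The path arithmetic behind the two files above can be checked with plain Python. This is only a sketch of the convention, not pyke code: resolve() is a hypothetical helper, and it assumes that a relative path is joined to the folder of the build file that uses it, which is how the convention is described in this section.

```python
import posixpath

def resolve(path, build_file):
    """Resolve path against the folder that contains build_file (hypothetical helper)."""
    # posixpath.join discards the folder when path is already absolute.
    return posixpath.normpath(posixpath.join(posixpath.dirname(build_file), path))

master = "/home/user/project/master.pyke"
component = "/home/user/project/component/build.pyke"

absolute = resolve("/home/user/project/output/", component)
from_master = resolve("output/", master)           # mkdir ('output/') in master.pyke
from_component = resolve("../output/", component)  # mkdir ('../output/') in build.pyke
via_master = resolve("output/", master)            # stands in for masterpath ('output/')
```

All four values normalize to the same folder, /home/user/project/output.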

Take a look at the Convenient path functions section to see how to use explicit functions to reference the files you want.

6.1.3. Optional keyword arguments

For convenience, there are optional keyword arguments that let you do extra things. By convention, optional keywords start with an underscore (_) to avoid colliding with regular function arguments.

Builtin functions accept the following extra parameters:

  • _needs is a list of extra file or folder dependencies that the job needs in order to run, either because these dependencies cannot be deduced by the job rule itself, or because you want to force a job to wait for another job.

  • _creates | _deletes | _updates are lists of extra files that will be created, deleted or overwritten when the job runs, but which cannot be deduced from the command itself, or which are convenient to declare because you want to force some tasks to wait for other tasks.

    • _creates should be used to specify one or more files that will be created or overwritten after the job is executed.
    • _deletes should be used to specify one or more files that will be removed after the job is executed.
    • _updates should be used to specify one or more files that are used both as input and as output of the command. They should exist before and after the command is executed. This is not the common case, but imagine, for example, that you want to take a file generated by another command and do a text replacement on it (writing the result to another file would be a better idea, but you can do it in place if you need to).
  • _if is used to specify the build policy that is going to be used. In other words, this value specifies under which conditions the function will be executed.

_if conditions

The valid condition values are:

  • False

    jobs will never be executed

  • True

    jobs will always be executed

  • “empty”

    for directories, jobs will run when the directories are empty (have no files inside)

    for files, jobs will run when the files are empty (file_size = 0)

  • “updated” (default for most rules)

    jobs will run when any input is newer than any output or when any output is missing

  • “exists”

    jobs will be executed if any of the targets exists

  • “missing”

    jobs will be executed if any target is missing

  • callback(inputs, outputs)

    a callback function that will be called at runtime, so it can decide whether internal jobs should be executed or not. This function receives a list of all input files (and dependencies) and a list of all output files, in no special order. The function should return True if the jobs need to be executed and False otherwise.
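
As a sketch of what such a callback can look like, the function below follows the wording given above for "updated": it returns True when any output is missing or when any input is newer than any output. The name needs_update is hypothetical, and comparing modification times is an assumption about what "newer" means; pyke's own check may be implemented differently.

```python
import os
import tempfile

def needs_update(inputs, outputs):
    """Return True when an output is missing or an input is newer than an output."""
    if any(not os.path.exists(path) for path in outputs):
        return True
    existing_inputs = [path for path in inputs if os.path.exists(path)]
    if not existing_inputs or not outputs:
        return False
    # "Any input newer than any output" means newest input vs. oldest output.
    newest_input = max(os.path.getmtime(path) for path in existing_inputs)
    oldest_output = min(os.path.getmtime(path) for path in outputs)
    return newest_input > oldest_output

# Small demonstration on temporary files with explicit timestamps.
with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, "in.txt")
    target = os.path.join(folder, "out.txt")
    open(source, "w").close()
    missing_output = needs_update([source], [target])  # target not created yet
    open(target, "w").close()
    os.utime(source, (100, 100))
    os.utime(target, (200, 200))
    fresh_output = needs_update([source], [target])    # target newer than source
    os.utime(source, (300, 300))
    stale_output = needs_update([source], [target])    # source newer than target
```

Following the convention described in this section, a function like this would be passed to a rule as _if = needs_update.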

Example using basic parameters:

from pyke.builtins import *

# "tmp_folder" will be created if it does not exist yet
mkdir ("tmp_folder")

# "tmp_folder" will be created every time you run pyke
mkdir ("tmp_folder", _if = True)

# "tmp_folder" will NEVER be created
mkdir ("tmp_folder", _if = False)

# "tmp_folder" will be created only when it does not exist
mkdir ("tmp_folder", _if = "missing")

# "tmp_folder" will be removed only if it exists
rmdir ("tmp_folder", _if = "exists")

Example using a callback:

from pyke.builtins import *
import zipfile

def is_zipped (inputs, outputs):
  """ Returns True when ANY of the input files are compressed and False otherwise """
  for infile in inputs:
    if zipfile.is_zipfile (infile):
      return True
  return False

# ... some rules ...

# download a file and uncompress it when the file is zipped
download (source_url, destination_file)
uncompress (destination_file, "out", _if = is_zipped)

In the example above, destination_file will only be decompressed when is_zipped returns True, which in this case means that destination_file has been compressed using the ZIP format.
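
The ordering effect of _needs and _creates described earlier in this section can be pictured with a toy model. The sketch below is not pyke's scheduler; order_jobs and the job dictionaries are hypothetical, and the only assumption is the one stated above: a job that needs a file waits for the job that declares it creates that file. It uses graphlib, which requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

def order_jobs(jobs):
    """Order jobs so that the creator of a file runs before the jobs that need it.

    jobs maps a job name to a dict with optional "needs" and "creates" lists.
    """
    # Map every declared output file to the job that creates it.
    producer = {}
    for name, job in jobs.items():
        for path in job.get("creates", []):
            producer[path] = name

    # A job depends on the producers of the files it needs.
    graph = {
        name: {producer[path] for path in job.get("needs", []) if path in producer}
        for name, job in jobs.items()
    }
    return list(TopologicalSorter(graph).static_order())

jobs = {
    "package": {"needs": ["app.bin"]},    # declared first, but must wait
    "compile": {"creates": ["app.bin"]},
}
order = order_jobs(jobs)
```

Here "compile" is ordered before "package" even though it was declared second, because "package" needs the file that "compile" creates.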

6.1.4. Run commands serially


serialStart(), along with serialEnd(), defines a block where jobs will be executed serially, no matter how many threads are being used to run things in parallel. All jobs created between these two calls will run one after another.

Take the following example:

>>> shell ('echo -n 1')
>>> shell ('echo -n 2')
>>> shell ('echo -n 3')

Since the commands have no dependencies between them, it is not guaranteed that ‘123’ is written. Depending on the number of threads, the output might end up being ‘132’ or ‘231’ instead.

The only ways to guarantee that the commands are executed one after another are to run the script using only 1 thread or to place the commands inside a serialStart() and serialEnd() block:

>>> serialStart ()
>>> shell ('echo -n 1')
>>> shell ('echo -n 2')
>>> shell ('echo -n 3')
>>> serialEnd ()

This construct always produces ‘123’.


serialEnd() closes the block: all jobs defined after calling it will run in parallel again.

Please note that all pyke scripts start running in parallel by default.
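
The difference between the two modes can be illustrated outside pyke with a small thread pool. This is a sketch of the idea only: run_jobs and echo are hypothetical names, and waiting for each job before submitting the next one is just one simple way to get serial behavior, not necessarily how pyke implements serialStart() and serialEnd().

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def run_jobs(jobs, serial, workers=4):
    """Run zero-argument callables on a thread pool.

    With serial=True each job finishes before the next one is submitted.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        if serial:
            for job in jobs:
                pool.submit(job).result()  # block until this job is done
        else:
            futures = [pool.submit(job) for job in jobs]
            for future in futures:
                future.result()            # completion order is not guaranteed

output = []
lock = threading.Lock()

def echo(text):
    """Build a job that appends text to the shared output list."""
    def job():
        with lock:
            output.append(text)
    return job

run_jobs([echo("1"), echo("2"), echo("3")], serial=True)
serial_output = "".join(output)
```

With serial=True the result is always '123'. With serial=False the same three jobs may finish in any order.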