File Download Support
- BitBake's fetch and
- fetch2 modules support downloading
- files.
- This chapter provides an overview of the fetching process
- and also presents sections on each of the fetchers BitBake
- supports.
-
- The original fetch code, for all
- practical purposes, has been replaced by
- fetch2 code.
- Consequently, the information in this chapter does not
- apply to fetch.
-
+ BitBake's fetch module is a standalone piece of library code
+ that deals with the intricacies of downloading source code
+ and files from remote systems.
+ Fetching source code is one of the cornerstones of building software.
+ As such, this module forms an important part of BitBake.
-
- Overview
+
+ The current fetch module is called "fetch2" and refers to the
+ fact that it is the second major version of the API.
+ The original version is obsolete and has been removed from the codebase.
+ Thus, in all cases, "fetch" refers to "fetch2" in this
+ manual.
+
+
+
+ The Download (Fetch)
- When BitBake starts to execute, the very first thing
- it does is to fetch the source files needed.
- This section overviews the process.
+ BitBake takes several steps when fetching source code or files.
+ The fetcher codebase deals with two distinct processes in order:
+ obtaining the files from somewhere (cached or otherwise)
+ and then unpacking those files into a specific location and
+ perhaps in a specific way.
+ Getting and unpacking the files is often optionally followed
+ by patching.
+ Patching, however, is not covered by the fetch module.
- When BitBake goes looking for source files, it follows a search
- order:
-
+ The code to execute the first part of this process, a fetch,
+ looks something like the following:
+
+ src_uri = (d.getVar('SRC_URI', True) or "").split()
+ fetcher = bb.fetch2.Fetch(src_uri, d)
+ fetcher.download()
+
+ This code sets up an instance of the fetch module.
+ The instance uses a space-separated list of URLs from the
+ SRC_URI
+ variable and then calls the download
+ method to download the files.
+
+
+
+ The instance of the fetch module is usually followed by:
+
+ rootdir = d.getVar('WORKDIR', True)
+ fetcher.unpack(rootdir)
+
+ This code unpacks the downloaded files to the
+ location specified by WORKDIR.
+
+ For convenience, the naming in these examples matches
+ the variables used by OpenEmbedded.
+
+ The SRC_URI and WORKDIR
+ variables are not coded into the fetcher.
+ The fetcher can be (and is) called with different variable names.
+ In OpenEmbedded for example, the shared state (sstate) code uses
+ the fetch module to fetch the sstate files.
+
+
+
+ When the download() method is called,
+ BitBake tries to fulfill the URLs by looking for source files
+ in a specific search order:
+ Pre-mirror Sites:
- BitBake first uses pre-mirrors to try and find source
- files.
+ BitBake first uses pre-mirrors to try and find source files.
These locations are defined using the
PREMIRRORS
variable.
Source URI:
- If pre-mirrors fail, BitBake uses
- SRC_URI.
+ If pre-mirrors fail, BitBake uses the original URL (e.g. from
+ SRC_URI).
Mirror Sites:
- If fetch failures occur using SRC_URI,
- BitBake next uses mirror location as defined by the
+ If fetch failures occur, BitBake next uses mirror locations as
+ defined by the
MIRRORS
variable.
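+ The search order above can be sketched in plain Python.
+ This is only an illustrative model, not BitBake's implementation:
+ the function name, the try_download callback, and the
+ example hosts are all hypothetical, and the pre-mirror and mirror
+ candidates are assumed to be already-expanded URLs.

```python
def fetch_with_fallback(url, premirrors, mirrors, try_download):
    """Return the first candidate URL that downloads successfully.

    Candidates are tried in the documented order: pre-mirrors first,
    then the original URL, then mirrors. `try_download` is a
    hypothetical callback that returns True on success.
    """
    candidates = list(premirrors) + [url] + list(mirrors)
    for candidate in candidates:
        if try_download(candidate):
            return candidate
    # Every location failed.
    return None
```

+ A caller would pass a real download routine as try_download;
+ returning None here stands in for a fetch failure.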
-
+
+
+
+
+ For each URL passed to the fetcher, the fetcher
+ calls the submodule that handles that particular URL type.
+ This behavior can be the source of some confusion when you
+ are providing URLs for the SRC_URI
+ variable.
+ Consider the following two URLs:
+
+ http://git.yoctoproject.org/git/poky;protocol=git
+ git://git.yoctoproject.org/git/poky;protocol=http
+
+ In the former case, the URL is passed to the
+ wget fetcher, which does not
+ understand "git".
+ Therefore, the latter case is the correct form since the
+ Git fetcher does know how to use HTTP as a transport.
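+ To make the distinction concrete, the following minimal sketch splits
+ a URL into the prefix that selects the fetcher and the
+ semicolon-separated parameters that are handed to it.
+ The helper name split_src_uri is hypothetical and the
+ parsing is deliberately simplified; BitBake's own URL decoding
+ handles more cases (for example, user names and passwords).

```python
def split_src_uri(url):
    """Split 'scheme://location;key=value;...' into its parts.

    Returns (scheme, location, params). The scheme is what selects
    the fetcher submodule; params are passed on to that fetcher.
    """
    location, *raw_params = url.split(";")
    scheme = location.split("://", 1)[0]
    params = {}
    for item in raw_params:
        if not item:
            continue
        key, _, value = item.partition("=")
        params[key] = value
    return scheme, location, params
```

+ For the two URLs above, the first yields the scheme "http" (so the
+ wget fetcher is chosen), while the second yields "git" with
+ "protocol" set to "http".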
- Because cross-URLs are supported, it is possible to mirror
- a Git repository on an HTTP server as a tarball.
Here are some examples that show commonly used mirror
definitions:
@@ -74,19 +129,29 @@
http://.*/.* http://somemirror.org/sources/ \n \
https://.*/.* http://somemirror.org/sources/ \n"
+ It is useful to note that BitBake supports
+ cross-URLs.
+ It is possible to mirror a Git repository on an HTTP
+ server as a tarball.
+ This is what the git:// mapping in
+ the previous example does.
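+ As a rough illustration of how a mirror definition is applied, the
+ sketch below pairs a URL regular expression with a replacement
+ location and keeps only the file name of the original URL.
+ The apply_mirror helper is hypothetical and much simpler
+ than the real mirror handling: it assumes a plain file download and
+ does not model how a Git repository is mapped to a mirror tarball name.

```python
import posixpath
import re


def apply_mirror(url, pattern, replacement):
    """Return the mirrored URL, or None if the pattern does not match.

    Simplifying assumption: the mirror stores each file directly under
    `replacement`, named after the last path component of `url`.
    """
    if re.fullmatch(pattern, url) is None:
        return None
    # Drop any ';key=value' parameters before taking the file name.
    path = url.split(";", 1)[0]
    return replacement.rstrip("/") + "/" + posixpath.basename(path)
```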
- Any source files that are not local (i.e. downloaded from
- the Internet) are placed into the download directory,
- which is specified by
- DL_DIR.
+ Since network accesses are slow, BitBake maintains a
+ cache of files downloaded from the network.
+ Any source files that are not local (i.e.
+ downloaded from the Internet) are placed into the download
+ directory, which is specified by the
+ DL_DIR
+ variable.
+ File integrity is of key importance for reproducing builds.
For non-local archive downloads, the fetcher code can verify
- sha256 and md5 checksums to ensure
- the archives have been downloaded correctly.
+ sha256 and md5 checksums to ensure the archives have been
+ downloaded correctly.
You can specify these checksums by using the
SRC_URI variable with the appropriate
varflags as follows:
@@ -97,18 +162,87 @@
You can also specify the checksums as parameters on the
SRC_URI as shown below:
- SRC_URI="http://example.com/foobar.tar.bz2;md5sum=4a8e0f237e961fd7785d19d07fdb994d"
+ SRC_URI = "http://example.com/foobar.tar.bz2;md5sum=4a8e0f237e961fd7785d19d07fdb994d"
+ If multiple URIs exist, you can specify the checksums either
+ directly as in the previous example, or you can name the URLs.
+ The following syntax shows how you name the URIs:
+
+ SRC_URI = "http://example.com/foobar.tar.bz2;name=foo"
+ SRC_URI[foo.md5sum] = "4a8e0f237e961fd7785d19d07fdb994d"
+
+ After a file has been downloaded and has had its checksum checked,
+ a ".done" stamp is placed in DL_DIR.
+ BitBake uses this stamp during subsequent builds to avoid
+ downloading or comparing a checksum for the file again.
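+ The check-then-stamp behavior can be modeled as follows.
+ This is a minimal sketch rather than the fetcher's actual code:
+ the verify_and_stamp name is hypothetical, only sha256 is
+ handled, and the stamp is assumed to be the file path with ".done"
+ appended.

```python
import hashlib
from pathlib import Path


def verify_and_stamp(path, expected_sha256):
    """Verify a downloaded file once, then record a '.done' stamp.

    Returns True if the file is trusted (stamp already present or
    checksum matches), False if the checksum does not match.
    """
    path = Path(path)
    stamp = Path(str(path) + ".done")
    if stamp.exists():
        # Already verified on an earlier run; skip the checksum.
        return True
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    if digest != expected_sha256:
        return False
    stamp.touch()
    return True
```

+ Note that once the stamp exists the checksum is not compared again,
+ which is why the local storage is assumed to be reliable.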
+
+ It is assumed that local storage is safe from data corruption.
+ If this were not the case, there would be bigger issues to worry about.
+
+
+
+
If
BB_STRICT_CHECKSUM
- is set, any download without a checksum triggers an error message.
- In cases where multiple files are listed using
- SRC_URI, the name parameter is used
- assign names to the URLs and these are then specified
- in the checksums using the following form:
-
- SRC_URI[name.sha256sum]
-
+ is set, any download without a checksum triggers an
+ error message.
+ The
+ BB_NO_NETWORK
+ variable can be used to make any attempted network access a fatal
+ error, which is useful for checking that mirrors are complete
+ as well as other things.
+
+
+
+
+ The Unpack
+
+
+ The unpack process usually immediately follows the download.
+ For all URLs except Git URLs, BitBake uses the common
+ unpack method.
+
+
+
+ A number of parameters exist that you can specify within the
+ URL to govern the behavior of the unpack stage:
+
+ unpack:
+ Controls whether the URL components are unpacked.
+ If set to "1", which is the default, the components
+ are unpacked.
+ If set to "0", the unpack stage leaves the file alone.
+ This parameter is useful when you want an archive to be
+ copied in and not be unpacked.
+
+ dos:
+ Applies to .zip and
+ .jar files and specifies whether to
+ use DOS line ending conversion on text files.
+
+ basepath:
+ Instructs the unpack stage to strip the specified
+ directories from the source path when unpacking.
+
+ subdir:
+ Unpacks the specific URL to the specified subdirectory
+ within the root directory.
+
+
+ The unpack call automatically decompresses and extracts files
+ with ".Z", ".z", ".gz", ".xz", ".zip", ".jar", ".ipk", ".rpm",
+ ".srpm", ".deb" and ".bz2" extensions as well as various combinations
+ of tarball extensions.
+
+
+
+ As mentioned, the Git fetcher has its own unpack method that
+ is optimized to work with Git trees.
+ Basically, this method works by cloning the tree into the final
+ directory.
+ The process is completed using references so that there is
+ only one central copy of the Git metadata needed.
@@ -116,47 +250,45 @@
Fetchers
- As mentioned in the previous section, the
- SRC_URI is normally used to
- tell BitBake which files to fetch.
- And, the fetcher BitBake uses depends on the how
- SRC_URI is set.
-
-
-
- These next few sections describe the available fetchers and
- their options.
- Each fetcher honors a set of variables URI parameters,
- which are separated by semi-colon characters and consist
- of a key and a value.
- The semantics of the variables and parameters are
- defined by the fetcher.
- BitBake tries to have consistent semantics between the
- different fetchers.
+ As mentioned earlier, the URL prefix determines which
+ fetcher submodule BitBake uses.
+ Each submodule can support different URL parameters,
+ which are described in the following sections.
- Local file fetcher
+ Local file fetcher (file://)
- The URN for the local file fetcher is file.
-
-
-
- The filename can be either absolute or relative.
- If the filename is relative,
+ This submodule handles URLs that begin with
+ file://.
+ The filename you specify within the URL can
+ either be an absolute or relative path to a file.
+ If the filename is relative, the contents of the
FILESPATH
- is used.
+ variable are used in the same way
+ PATH is used to find executables.
Failing that,
FILESDIR
is used to find the appropriate relative file.
+
+ FILESDIR is deprecated and can
+ be replaced with FILESPATH.
+ Because FILESDIR is likely to be
+ removed, you should not use this variable in any new code.
+
+ If the file cannot be found, it is assumed that it is available in
+ DL_DIR
+ by the time the download() method is called.
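+ The PATH-like lookup for relative file names can be sketched as
+ shown here.
+ The find_in_filespath helper is hypothetical, and it
+ assumes FILESPATH is a colon-separated list of directories that are
+ searched in order; the deprecated FILESDIR fallback is not modeled.

```python
import os


def find_in_filespath(filename, filespath):
    """Locate `filename` the way a shell locates a command on PATH.

    Absolute names are used as-is. Relative names are looked up in
    each directory of the colon-separated `filespath`, in order.
    Returns the first existing path, or None if nothing is found.
    """
    if os.path.isabs(filename):
        return filename if os.path.exists(filename) else None
    for directory in filespath.split(":"):
        if not directory:
            continue
        candidate = os.path.join(directory, filename)
        if os.path.exists(candidate):
            return candidate
    return None
```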
- The metadata usually extend these variables to include
- variations of the values in
- OVERRIDES.
- Single files and complete directories can be specified.
+ If you specify a directory, the entire directory is
+ unpacked.
+
+
+
+ Here are some example URLs:
SRC_URI = "file://relativefile.patch"
SRC_URI = "file://relativefile.patch;this=ignored"
@@ -166,36 +298,53 @@
- CVS fetcher
+ CVS fetcher (cvs://)
- The URN for the CVS fetcher is cvs.
-
-
-
- This fetcher honors the variables CVSDIR,
- SRCDATE, FETCHCOMMAND_cvs,
- UPDATECOMMAND_cvs.
- The
- DL_DIR
- variable specifies where a
- temporary checkout is saved.
- The
- SRCDATE
- variable specifies which date to
- use when doing the fetching.
- The special value of "now" causes the checkout to be
- updated on every build.
- The FETCHCOMMAND and
- UPDATECOMMAND variables specify the executables
- to use for the CVS checkout or update.
+ This submodule handles checking out files from the
+ CVS version control system.
+ You can configure it using a number of different variables:
+
+ FETCHCMD_cvs:
+ The name of the executable to use when running
+ the cvs command.
+ This name is usually "cvs".
+
+ SRCDATE:
+ The date to use when fetching the CVS source code.
+ A special value of "now" causes the checkout to
+ be updated on every build.
+
+ CVSDIR:
+ Specifies where a temporary checkout is saved.
+ The location is often DL_DIR/cvs.
+
+ CVS_PROXY_HOST:
+ The name to use as a "proxy=" parameter to the
+ cvs command.
+
+ CVS_PROXY_PORT:
+ The port number to use as a "proxyport=" parameter to
+ the cvs command.
+
+
+ As well as the standard username and password URL syntax,
+ you can also configure the fetcher with various URL parameters:
The supported parameters are as follows:
+ "method":
+ The protocol over which to communicate with the cvs server.
+ By default, this protocol is "pserver".
+ If "method" is set to "ext", BitBake examines the
+ "rsh" parameter and sets CVS_RSH.
+ You can use "dir" for local directories.
+ "module":
Specifies the module to check out.
+ You must supply this parameter.
"tag":
Describes which CVS TAG should be used for
@@ -210,23 +359,36 @@
The special value of "now" causes the checkout to be
updated on every build.
- "method":
- By default pserver.
- If "method" is set to "ext", BitBake examines the "rsh"
- parameter and sets CVS_RSH.
- "localdir":
- Used to checkout force into a special
+ Used to rename the module.
+ Effectively, you are renaming the output directory
+ to which the module is unpacked.
+ You are forcing the module into a special
directory relative to CVSDIR.
"rsh"
Used in conjunction with the "method" parameter.
"scmdata":
- I need a description for this.
+ Causes the CVS metadata to be maintained in the tarball
+ the fetcher creates when set to "keep".
+ The tarball is expanded into the work directory.
+ By default, the CVS metadata is removed.
+
+ "fullpath":
+ Controls whether the resulting checkout is at the
+ module level, which is the default, or is at deeper
+ paths.
+
+ "norecurse":
+ Causes the fetcher to check out only the specified
+ directory without recursing into any subdirectories.
+
+ "port":
+ The port on which to connect to the CVS server.
- Following are two examples using cvs:
+ Some example URLs are as follows:
SRC_URI = "cvs://CVSROOT;module=mymodule;tag=some-version;method=ext"
SRC_URI = "cvs://CVSROOT;module=mymodule;date=20060126;localdir=usethat"
@@ -235,19 +397,27 @@
- HTTP/FTP fetcher
+ HTTP/FTP wget fetcher (http://, ftp://, https://)
- The URNs for the HTTP/FTP fetcher are http, https, and ftp.
+ This fetcher obtains files from web and FTP servers.
+ Internally, the fetcher uses the wget utility.
- This fetcher honors the variables
- FETCHCOMMAND_wget.
- The FETCHCOMMAND variable
- contains the command used for fetching.
- “${URI}” and “${FILES}” are replaced by the URI and
- the base name of the file to be fetched.
+ The executable and parameters used are specified by the
+ FETCHCMD_wget variable, which defaults
+ to sensible values.
+ The fetcher supports a parameter "downloadfilename" that
+ allows the name of the downloaded file to be specified.
+ Specifying the name of the downloaded file is useful
+ for avoiding collisions in
+ DL_DIR
+ when dealing with multiple files that have the same name.
+
+
+
+ Some example URLs are as follows:
SRC_URI = "http://oe.handhelds.org/not_there.aac"
SRC_URI = "ftp://oe.handhelds.org/not_there_as_well.aac"
@@ -257,36 +427,46 @@
- SVN Fetcher
+ Subversion (SVN) Fetcher (svn://)
- The URN for the SVN fetcher is svn.
-
-
-
- This fetcher honors the variables
- FETCHCOMMAND_svn,
- SVNDIR,
- and
- SRCREV.
- The FETCHCOMMAND variable contains the
- subversion command.
- The SRCREV variable specifies which revision
- to use when doing the fetching.
+ This fetcher submodule fetches code from the
+ Subversion source control system.
+ The executable used is specified by
+ FETCHCMD_svn, which defaults
+ to "svn".
+ The fetcher's temporary working directory is set
+ by SVNDIR, which is usually
+ DL_DIR/svn.
The supported parameters are as follows:
- "proto":
- The Subversion protocol.
+ "module":
+ The name of the svn module to checkout.
+ You must provide this parameter.
+ You can think of this parameter as the top-level
+ directory of the repository data you want.
+
+ "protocol":
+ The protocol to use, which defaults to "svn".
+ Other options are "svn+ssh" and "rsh".
+ For "rsh", the "rsh" parameter is also used.
"rev":
- The Subversion revision.
+ The revision of the source code to checkout.
+
+ "date":
+ The date of the source code to checkout.
+ Specific revisions are generally much safer to check out
+ than dates because they do not involve timezones
+ (i.e. they are much more deterministic).
"scmdata":
- Set to "keep" causes the “.svn” directories
- to be available during compile-time.
+ Causes the “.svn” directories to be available during
+ compile-time when set to "keep".
+ By default, these directories are removed.
Following are two examples using svn:
@@ -298,40 +478,150 @@
- GIT Fetcher
+ Git Fetcher (git://)
- The URN for the Git Fetcher is git.
+ This fetcher submodule fetches code from the Git
+ source control system.
+ The fetcher works by creating a bare clone of the
+ remote into GITDIR, which is
+ usually DL_DIR/git.
+ This bare clone is then cloned into the work directory during the
+ unpack stage when a specific tree is checked out.
+ This is done using alternates and by reference to
+ minimize the amount of duplicate data on the disk and
+ make the unpack process fast.
+ The executable used can be set with
+ FETCHCMD_git.
- The variable GITDIR is used as the
- base directory in which the Git tree is cloned.
-
-
-
- The supported parameters are as follows:
+ This fetcher supports the following parameters:
- "tag":
- The Git tag.
- The default is "master".
- "protocol":
- The Git protocol.
+ The protocol used to fetch the files.
The default is "git" when a hostname is set.
If a hostname is not set, the Git protocol is "file".
+ You can also use "http", "https", "ssh" and "rsync".
- "scmdata":
- When set to “keep”, the “.git” directory is available
- during compile-time.
+ "nocheckout":
+ Tells the fetcher to not checkout source code when
+ unpacking when set to "1".
+ Set this option for the URL where there is a custom
+ routine to checkout code.
+ The default is "0".
+
+ "rebaseable":
+ Indicates that the upstream Git repository can be rebased.
+ You should set this parameter to "1" if
+ revisions can become detached from branches.
+ In this case, the source mirror tarball is done per
+ revision, which has a loss of efficiency.
+ Rebasing the upstream Git repository could cause the
+ current revision to disappear from the upstream repository.
+ This option reminds the fetcher to preserve the local cache
+ carefully for future use.
+ The default value for this parameter is "0".
+
+ "nobranch":
+ Tells the fetcher to not check the SHA validation
+ for the branch when set to "1".
+ The default is "0".
+ Set this option for the recipe that refers to
+ the commit that is valid for a tag instead of
+ the branch.
+
+ "bareclone":
+ Tells the fetcher to clone a bare clone into the
+ destination directory without checking out a working tree.
+ Only the raw Git metadata is provided.
+ This parameter implies the "nocheckout" parameter as well.
+
+ "branch":
+ The branch(es) of the Git tree to clone.
+ If unset, this is assumed to be "master".
+ The number of branch parameters must match the number of
+ name parameters.
+
+ "rev":
+ The revision to use for the checkout.
+ The default is "master".
+
+ "tag":
+ Specifies a tag to use for the checkout.
+ To correctly resolve tags, BitBake must access the
+ network.
+ For that reason, tags are often not used.
+ As far as Git is concerned, the "tag" parameter behaves
+ effectively the same as the "rev" parameter.
+
+ "subpath":
+ Limits the checkout to a specific subpath of the tree.
+ By default, the whole tree is checked out.
+
+ "destsuffix":
+ The name of the path in which to place the checkout.
+ By default, the path is git/.
- Following are two examples using git:
+ Here are some example URLs:
SRC_URI = "git://git.oe.handhelds.org/git/vip.git;tag=version-1"
SRC_URI = "git://git.oe.handhelds.org/git/vip.git;protocol=http"
+
+
+ Other Fetchers
+
+
+ Fetch submodules also exist for the following:
+
+
+ Bazaar (bzr://)
+
+
+ Perforce (p4://)
+
+
+ SVK
+
+
+ Git Submodules (gitsm://)
+
+
+ Trees using Git Annex (gitannex://)
+
+
+ Secure FTP (sftp://)
+
+
+ Secure Shell (ssh://)
+
+
+ Repo (repo://)
+
+
+ OSC (osc://)
+
+
+ Mercurial (hg://)
+
+
+ No documentation currently exists for these lesser-used
+ fetcher submodules.
+ However, you might find the code helpful and readable.
+
+
+
+
+
+ Auto Revisions
+
+
+ We need to document AUTOREV and
+ SRCREV_FORMAT here.
+