Configuration

For instructions on how to build Tracker from source, see the README.md file in the Git repo.

Environment Variables

All of the environment variables below can be seen in use in the tracker-sandbox.py script, which was written to show how Tracker can be run in an isolated case just by configuring environment variables.

Tracker Specific

There is a good summary of environment variables in the generated documentation that ships with libtracker-sparql. Not all of the environment variables listed there apply to every binary or component. To find out which environment variables apply to which binary, consult the man pages on the terminal, e.g. man tracker-store or man tracker-miner-fs.

Environment Variable

Description

'TRACKER_DB_ONTOLOGIES_DIR'

Default is $PREFIX/share/tracker/ontology/. The location of the ontology is important because tracker-store uses it to check for database schema migrations on startup and, if no database exists yet, uses the ontology it finds here to create a new one.

'TRACKER_MINERS_DIR'

Default is $PREFIX/share/tracker/miners/. The miners directory has files per miner so the TrackerMinerManager knows about ALL miners and can easily contact them to get status/pause/resume/etc. If miners don't add the relevant files here, tracker-control is unable to control them.

'TRACKER_EXTRACTOR_RULES_DIR'

Default is $PREFIX/share/tracker/extractor-rules/. The extractor rules help tracker-extract know how to process different files and which extractors should handle which mime types.

'TRACKER_LANGUAGE_STOPWORDS_DIR'

Default is $PREFIX/share/tracker/stop-words/. Stop words are important when using the Full Text Search feature. They are a per-locale list of common words (for example 'the') that we can avoid indexing. If you want to load stop words from another location, you can use this environment variable.

'TRACKER_FTS_STOP_WORDS'

Unset by default. If it is set to 0, stop words (words like 'the', 'is', etc.) are indexed.

Others affecting Tracker binaries / components

Environment Variable

Description

'XDG_DATA_HOME'

Default is $HOME/.local/share/. Tracker stores the journals and logs in here under a 'tracker' subdirectory.

'XDG_CONFIG_HOME'

Default is $HOME/.config/. Tracker stores configuration files in here under a 'tracker' subdirectory, but ONLY if the 'TRACKER_USE_CONFIG_FILES' environment variable is defined.

'XDG_CACHE_HOME'

Default is $HOME/.cache/. Tracker stores the database and any cache in here under a 'tracker' subdirectory.

'XDG_RUNTIME_DIR'

This is used as the base directory relative to which user-specific non-essential runtime files and other file objects (such as sockets, named pipes, ...) should be stored. The directory MUST be owned by the user, and he MUST be the only one having read and write access to it. Its Unix access mode MUST be 0700.

'XDG_DATA_DIRS'

Default is $PREFIX/local/share/:$PREFIX/share/. It's used to know which directories to search for application data outside of $HOME.

'G_MESSAGES_DEBUG'

Tracker uses a lot of GLib debug logging calls. If this is not set, your Tracker binaries won't show debug messages explaining what they are doing. Setting it to "all" enables debug messages from every log domain.

'DBUS_VERBOSE'

If set (to anything), DBus will be verbose about what it is doing when you run dbus-daemon.
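
As a rough illustration of how the variables above combine to isolate a Tracker instance, the following Python sketch builds an environment that redirects the XDG directories into a single sandbox directory. The helper name sandbox_env and the sub-directory names are assumptions made for this example; they are not taken from tracker-sandbox.py.

```python
import os


def sandbox_env(sandbox_dir, prefix="/usr", debug=True):
    """Return a copy of the current environment pointing Tracker at sandbox_dir.

    Hypothetical helper: the sub-directory names are illustrative only.
    """
    env = dict(os.environ)
    # Journals and logs end up under $XDG_DATA_HOME/tracker.
    env["XDG_DATA_HOME"] = os.path.join(sandbox_dir, "data")
    # Config files are only used if TRACKER_USE_CONFIG_FILES is defined.
    env["XDG_CONFIG_HOME"] = os.path.join(sandbox_dir, "config")
    env["TRACKER_USE_CONFIG_FILES"] = "1"
    # The database and caches end up under $XDG_CACHE_HOME/tracker.
    env["XDG_CACHE_HOME"] = os.path.join(sandbox_dir, "cache")
    # Keep using the installed ontology; adjust prefix for a custom build.
    env["TRACKER_DB_ONTOLOGIES_DIR"] = os.path.join(
        prefix, "share", "tracker", "ontology"
    )
    if debug:
        # Show GLib debug messages from every log domain.
        env["G_MESSAGES_DEBUG"] = "all"
    return env
```

The returned dictionary can be passed as the env argument of subprocess.Popen when launching one of the Tracker binaries.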

How processes are started

Methods

The Tracker processes installed into $PREFIX/libexec (tracker-store, tracker-miner-fs, ...) can be started a number of ways.

  • Manually.
  • Via DBus using one of the many APIs provided.
  • Via Desktop files (which is related to DBus instantiation) and when the desktop is started.
  • From shell scripts (many headless systems use this approach).

The Tracker processes mentioned have desktop files which allow the binaries to be started when the computer starts. You can find the desktop files in the source code repository. The important key in those files is X-GNOME-Autostart-enabled, which is technically an extension of the desktop file specification (there is a different key for KDE and other desktop environments). The default is to start these processes on computer startup. You can easily change this.
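
To check how a given desktop file is configured, the key can be read with Python's configparser module. This is only a sketch: the sample file contents are made up for illustration, and treating a missing key as "enabled" is an assumption based on the default described above.

```python
import configparser


def autostart_enabled(desktop_text):
    """Return True unless X-GNOME-Autostart-enabled is explicitly 'false'."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # desktop file keys are case-sensitive
    parser.read_string(desktop_text)
    value = parser.get(
        "Desktop Entry", "X-GNOME-Autostart-enabled", fallback="true"
    )
    return value.strip().lower() != "false"


# Hypothetical desktop file, not copied from the Tracker repository.
SAMPLE = """\
[Desktop Entry]
Name=Example miner
Exec=/usr/libexec/example-miner
X-GNOME-Autostart-enabled=false
"""

print(autostart_enabled(SAMPLE))
```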

How should processes be started?

  1. tracker-miner-fs: Typically this is the main process that needs to be started (to ensure data population).

  2. tracker-store: Use of libtracker-sparql (which tracker-miner-fs uses heavily) ensures that tracker-store is started (for database checking, integrity, migration paths, etc). This process is only needed initially and for database updates. For simple queries, it's idle most of the time (unless configured differently, see the TRACKER_BACKEND environment variable).

  3. tracker-extract: This process listens for changes coming from the database and should be started early on because it reacts to changes from processes like tracker-miner-fs. Starting it early is not strictly necessary though: it will discover all unprocessed files even if it wasn't started in time.

  4. tracker-writeback: If you would like data written back to files (for example tags in MP3s), you can start this process next.
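
For systems that start the processes from a script, the order above could be captured along these lines. The helper launch_plan is hypothetical, and leaving tracker-store out of the list is a choice made for this sketch because libtracker-sparql clients cause it to be started when needed.

```python
import os

# Order taken from the list above; tracker-store is omitted on purpose
# (see the note in the surrounding text).
START_ORDER = ["tracker-miner-fs", "tracker-extract"]


def launch_plan(libexec_dir, writeback=False):
    """Return the binaries to start, in order (hypothetical helper)."""
    names = list(START_ORDER)
    if writeback:
        # Only needed if data should be written back to files.
        names.append("tracker-writeback")
    return [os.path.join(libexec_dir, name) for name in names]


for path in launch_plan("/usr/libexec", writeback=True):
    print(path)
```

Each path can then be handed to subprocess.Popen in turn; /usr/libexec stands in for $PREFIX/libexec here.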

GSettings / INI file

The settings are stored in data/gschemas/* and are documented in detail, including the acceptable values for each setting and a decent description of what each setting does. There is one schema per binary (usually):

$ ls data/gschemas/*.xml | sort | uniq
data/gschemas/org.freedesktop.Tracker.DB.gschema.xml
data/gschemas/org.freedesktop.Tracker.enums.xml
data/gschemas/org.freedesktop.Tracker.Extract.gschema.xml
data/gschemas/org.freedesktop.Tracker.FTS.gschema.xml
data/gschemas/org.freedesktop.Tracker.gschema.xml
data/gschemas/org.freedesktop.Tracker.Miner.Files.gschema.xml
data/gschemas/org.freedesktop.Tracker.Store.gschema.xml
data/gschemas/org.freedesktop.Tracker.Writeback.gschema.xml

A quick way to find out what settings are available on any given installed version of tracker is to use:

$ gsettings list-recursively | grep -i org.freedesktop.Tracker | sort | uniq
org.freedesktop.Tracker.DB journal-chunk-size 50
org.freedesktop.Tracker.DB journal-rotate-destination ''
org.freedesktop.Tracker.Extract max-bytes 1048576
org.freedesktop.Tracker.Extract max-media-art-width 0
org.freedesktop.Tracker.Extract sched-idle 'first-index'
org.freedesktop.Tracker.Extract verbosity 'errors'
org.freedesktop.Tracker.Extract wait-for-miner-fs false
org.freedesktop.Tracker.FTS enable-stemmer false
org.freedesktop.Tracker.FTS enable-unaccent true
org.freedesktop.Tracker.FTS ignore-numbers true
org.freedesktop.Tracker.FTS ignore-stop-words true
org.freedesktop.Tracker.FTS max-word-length 30
org.freedesktop.Tracker.FTS max-words-to-index 10000
org.freedesktop.Tracker.Miner.Files crawling-interval -1
org.freedesktop.Tracker.Miner.Files enable-monitors true
org.freedesktop.Tracker.Miner.Files enable-writeback true
org.freedesktop.Tracker.Miner.Files ignored-directories ['core-dumps', 'CVS', 'lost+found', 'po']
org.freedesktop.Tracker.Miner.Files ignored-directories-with-content ['backup.metadata']
org.freedesktop.Tracker.Miner.Files ignored-files ['*~', 'autom4te', '*.aux', 'confdefs.h', 'config.status', 'configure', 'confstat', 'conftest', '*.csproj', '*.gmo', '*.in', '*.la', 'libtool', '*.lo', '*.loT', 'ltmain.sh', '*.lzo', '*.m4', 'Makefile', '*.nvram', '*.o', '*.omf', '*.orig', '*.part', '*.pc', '*.po', '*.rcore', '*.rej', 'SCCS', '*.tmp', '*.vm*', '*.vmdk']
org.freedesktop.Tracker.Miner.Files index-on-battery false
org.freedesktop.Tracker.Miner.Files index-on-battery-first-time true
org.freedesktop.Tracker.Miner.Files index-optical-discs true
org.freedesktop.Tracker.Miner.Files index-recursive-directories ['&DESKTOP', '&DOCUMENTS', '&DOWNLOAD', '&MUSIC', '&PICTURES', '&VIDEOS']
org.freedesktop.Tracker.Miner.Files index-removable-devices false
org.freedesktop.Tracker.Miner.Files index-single-directories ['$HOME']
org.freedesktop.Tracker.Miner.Files initial-sleep 15
org.freedesktop.Tracker.Miner.Files low-disk-space-limit -1
org.freedesktop.Tracker.Miner.Files removable-days-threshold 3
org.freedesktop.Tracker.Miner.Files sched-idle 'first-index'
org.freedesktop.Tracker.Miner.Files throttle 0
org.freedesktop.Tracker.Miner.Files verbosity 'errors'
org.freedesktop.Tracker.Store graphupdated-delay 1000
org.freedesktop.Tracker.Store verbosity 'errors'
org.freedesktop.Tracker.Writeback verbosity 'errors'
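
If the settings need to be inspected from a script, the output shown above is simple enough to parse line by line. The following sketch groups keys by schema; the function name parse_gsettings is made up for this example, and values are kept as the raw text printed by gsettings rather than being converted.

```python
def parse_gsettings(output, prefix="org.freedesktop.Tracker"):
    """Group 'schema key value' lines into {schema: {key: value}}."""
    settings = {}
    for line in output.splitlines():
        # Split into at most three fields so list values stay intact.
        parts = line.split(None, 2)
        if len(parts) != 3 or not parts[0].startswith(prefix):
            continue
        schema, key, value = parts
        settings.setdefault(schema, {})[key] = value
    return settings


SAMPLE = """\
org.freedesktop.Tracker.FTS max-word-length 30
org.freedesktop.Tracker.Miner.Files ignored-directories ['core-dumps', 'CVS']
org.freedesktop.Tracker.Miner.Files throttle 0
"""

for schema, keys in parse_gsettings(SAMPLE).items():
    print(schema, sorted(keys))
```

To parse live output instead of the sample, the text can be obtained with subprocess.run(["gsettings", "list-recursively"], capture_output=True, text=True).stdout.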

System Resource Management

In the past, Tracker has been criticised for using too many resources on users' computers. A number of mechanisms are in place to help prevent this, including:

  • I/O Priority - ioprio_set(): Used for changing the Input/Output (or disk) based scheduling. This is needed because tracker-miner-fs and tracker-store make heavy use of the disk when crawling the file system or modifying the database.

    • The kernel has a document explaining this in more detail including examples.

    • Related to this is ionice, a command line tool doing the same.

  • CPU Scheduler - sched_setscheduler(): Only used with tracker-miner-fs and tracker-extract and set to SCHED_IDLE to tell the scheduler to give priority to other applications by default. There are options in the settings above to determine if we use this on first index, always or not at all.

    • See configuration options org.freedesktop.Tracker.Miner.Files.sched-idle for tracker-miner-fs and org.freedesktop.Tracker.Extract.sched-idle for tracker-extract.

  • CPU Scheduler - nice(): Only used with tracker-miner-fs and tracker-extract because they steal CPU time slices otherwise. We don't use this with tracker-store because that's automatically indirectly controlled by the requests coming from tracker-miner-fs. We use nice(19) in cases where the API is used.

    • Related to this is nice, a command line tool doing the same.

    • Why use nice() and sched_setscheduler()? Well, nice() doesn't apply when SCHED_IDLE is used, but we don't always use it (as mentioned above, it's configurable, but defaults to being on). So in cases where we don't use SCHED_IDLE, we still want to be a lower priority, just not as low. Technically, SCHED_IDLE is much lower priority than nice(19).

  • Disk space - statvfs(): Obviously Tracker needs disk space to operate. Tracker uses UPower or HAL to check the disk space conditions on a device.

    • See configuration option org.freedesktop.Tracker.Miner.Files.low-disk-space-limit used by tracker-miner-fs. Values indicate if we pause indexing at a percentage of low disk space (0-100% or -1 to disable the check entirely).

    • Additionally, tracker-store requires at least 5 MB of free disk space on the partition holding the database in order to start for the first time (when the databases are created).

  • Battery: The tracker-miner-fs process has settings to disable indexing according to battery conditions. This can be useful for laptops and small scale devices.

    • See configuration options org.freedesktop.Tracker.Miner.Files.index-on-battery for when your device is running on battery instead of AC power and org.freedesktop.Tracker.Miner.Files.index-on-battery-first-time to override the previous option but ONLY on the first index.

  • Throttle - sleep(): The tracker-miner-fs has a setting to call sleep() between processing resources (files) thereby throttling the indexing process.

    • See configuration option org.freedesktop.Tracker.Miner.Files.throttle.
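
The scheduling and disk space checks described above can be sketched in Python using the same system calls. This is an illustration rather than Tracker's actual code: the function names are invented, 'always' and 'never' are assumed spellings for the sched-idle options other than 'first-index', and pausing when the free space is at or below the limit is an assumed reading of low-disk-space-limit.

```python
import os


def wants_sched_idle(setting, first_index):
    """Map a sched-idle setting to a yes/no decision."""
    if setting == "always":
        return True
    if setting == "first-index":
        return first_index
    return False  # 'never' or anything unrecognised


def lower_priority(setting, first_index):
    """Use SCHED_IDLE when requested, otherwise fall back to nice(19)."""
    if wants_sched_idle(setting, first_index) and hasattr(os, "SCHED_IDLE"):
        # pid 0 means the calling process; SCHED_IDLE needs priority 0.
        os.sched_setscheduler(0, os.SCHED_IDLE, os.sched_param(0))
        return "SCHED_IDLE"
    os.nice(19)
    return "nice"


def free_space_percent(path):
    """Percentage of the file system holding path that is still available."""
    st = os.statvfs(path)
    if st.f_blocks == 0:
        return 0.0  # pseudo file systems may report no blocks at all
    return 100.0 * st.f_bavail / st.f_blocks


def should_pause(free_percent, limit):
    """Pause when free space is at or below limit; -1 disables the check."""
    return limit != -1 and free_percent <= limit


print(should_pause(free_space_percent("/"), -1))
```

Note that lower_priority() changes the priority of the calling process and cannot be undone without privileges, so it is defined here but not called.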

Other techniques tried and tested in the past (but not used in or by Tracker):

  • CGroups: In the past some businesses (like Nokia) using Tracker on embedded systems have used CGroups to limit the resources available to the Tracker processes. Putting the Tracker processes into a group with limited resources means that if Tracker runs out of memory or tries to use maximum CPU power, the rest of the system running in other groups won't be starved of CPU time slices and should still have some system memory left to use. This is a system level configuration and not something Tracker makes use of internally.

    • The kernel has a document explaining this in more detail.

  • Rogue Process Management - oom_score_adj or oom_adj: In some cases, this technique has been used to make sure rogue extractors implemented in tracker-extract or bugs in tracker-miner-fs don't allow these processes to exhaust system memory. The basic idea is that you volunteer your process to be one of the first to be killed off by the kernel when the system seriously lacks memory.

    • The kernel has a document explaining this in more detail.

  • Memory allocator limitation - setrlimit(): Tracker currently has a function called tracker_memory_setrlimits() which sets RLIMIT_AS and RLIMIT_DATA, clamping the limit between 50% of total memory and MAXLONG (2GB on 32-bit) as an upper limit. RLIMIT_AS affects calls like mmap() (which are used by the MP3 and other extractors). RLIMIT_DATA affects the initialized data, uninitialized data and heap allocations.

  • Memory allocator watchdog - osso_mem_saw_enable(): Nokia implemented this to make sure processes didn't accidentally try to allocate too much memory. Tracker used it at one time, but no longer does.
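
The setrlimit() approach mentioned above can be approximated with Python's resource module. This sketch is not a port of tracker_memory_setrlimits(): it assumes the limit is half of physical memory capped at the platform's maximum long value, and the helper names are invented for the example.

```python
import os
import resource
import sys


def total_memory():
    """Physical memory in bytes, as reported by sysconf()."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


def clamp_limit(total, upper=sys.maxsize):
    """Half of total memory, capped at upper (assumed reading of the text)."""
    return min(total // 2, upper)


def apply_memory_limits(limit):
    """Set the soft RLIMIT_AS and RLIMIT_DATA without raising the hard limit."""
    for which in (resource.RLIMIT_AS, resource.RLIMIT_DATA):
        _soft, hard = resource.getrlimit(which)
        if hard == resource.RLIM_INFINITY:
            new_soft = limit
        else:
            new_soft = min(limit, hard)
        resource.setrlimit(which, (new_soft, hard))


print(clamp_limit(total_memory()))
```

apply_memory_limits() is deliberately not called here, because a low RLIMIT_AS affects every later allocation in the calling process.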

Attic/Tracker/Documentation/Configuration (last edited 2023-08-14 12:50:16 by CarlosGarnacho)