Concourse Release Notes

v6.7.6

Release Date: March 30, 2021

Security Fixes

  • Updated openssl in the git resource to resolve latest openssl CVEs
  • Updated Ubuntu OS packages in the Concourse image to resolve latest CVEs

Resolved Issues

Resolved Core Functionality Issues

These are the fixes to the core functionality of Concourse:

  • Enhanced var_sources to handle a slow Vault server, for example when a Vault login takes longer than 5 seconds.
  • Fixed an ATC crash caused by parallel load_var steps. When multiple load_var steps ran in parallel, there was a risk of the ATC crashing due to concurrent map writes.
  • Fixed an OIDC connector issue where group claims were not fetched.
    • Fixed a regression introduced in 6.7.3 where the OIDC connector did not fetch the groups claim by default. The OIDC connector now always fetches the groups claim unless the --oidc-disable-groups flag is set.
    • Some OIDC providers don't include the email_verified claim, which causes a validation error by default. To support these providers, you can set CONCOURSE_OIDC_SKIP_EMAIL_VERIFIED_VALIDATION to true.
  • Fixed an issue where aborted builds were retried when the worker entered a bad state.
  • Bumped lib/pq to 1.10.0, which fixes a regression in lib/pq where, under certain circumstances, the driver would not drop dead connections and would never recover.
  • Fixed a panic in the New Relic metrics emitter

v6.7.3

Release Date: January 21, 2021

Upgrading

Warning

If you are upgrading a BOSH-deployed Concourse, you might need to upgrade BPM to at least 1.1.9. There is a known issue with using BPM to manage newer versions of runc. See the BOSH section for more information.

Warning

Due to a change in gdn, if you rely on your containers accessing the host worker's network (e.g. via a task), you must set garden.allow_host_access to true. See the Breaking Changes section for more information.

Warning

Team RBAC for the OIDC and Bitbucket Cloud auth connectors is broken. These are known issues and will be fixed in the next patch (v6.7.4). Please do not upgrade if these affect you. The OIDC connector ignores the groups claim in responses. Further details here. The Bitbucket Cloud connector relies on a deprecated API endpoint. Further details here.

BOSH

  • The Concourse worker process now depends, via gdn v1.19.16, on runc v1.0.0-rc91. There is a known issue with using BPM to manage newer versions of runc, which has been resolved in bpm-release v1.1.9.
    • If your Concourse is deployed using BOSH, you will also need to use at least BPM 1.1.9.
    • After upgrading, if you start to see build steps erroring with a message similar to:
      runc run: exit status 1: container_linux.go:349: starting container process caused "process_linux.go:439: container init caused \"process_linux.go:405: setting cgroup config for procHooks process caused \\\"failed to write \\\\\\\"c 5:1 rwm\\\\\\\" to \\\\\\\"/sys/fs/cgroup/devices/system.slice/concourse.service/garden/a206550f-f6dd-4609-4f13-0a11afd3fd93/devices.allow\\\\\\\": write /sys/fs/cgroup/devices/system.slice/concourse.service/garden/a206550f-f6dd-4609-4f13-0a11afd3fd93/devices.allow: operation not permitted\\\"\""
      
      then you probably need to upgrade BPM.

Breaking Changes

  • Team RBAC for the OIDC and Bitbucket Cloud auth connectors is broken
    • These are known issues and will be fixed in the next patch (v6.7.4). Please do not upgrade if these affect you
    • The OIDC connector ignores the groups claim in responses
    • The Bitbucket Cloud connector relies on a deprecated API endpoint
  • Allowing your build containers to access the worker's network (including the BOSH DNS nameserver) requires opting-in to the allow_host_access garden configuration
    • While this was a configuration option before, previous versions of gdn (and hence Concourse) defaulted to enabling this setting - now, it is disabled by default
    • Instructions for enabling this flag can be found in the Allowing Host Access guide
  • Bump Dex to 2.27.0 which fixes a vulnerability in the Go XML library
    • OIDC connector: the preferred_username claim now takes precedence over the --oidc-user-name-key auth flag
    • Previously, the preferred_username claim was ignored in favor of the --oidc-user-name-key Concourse auth flag. Now, the preferred_username claim takes precedence, so its value is used as the Concourse OIDC username.
  • Generate opaque OAuth2 access tokens
    • There were several issues that users encountered (particularly after v6.1.0) as a result of long access tokens. Concourse now generates much shorter access tokens rather than using the raw user data.
    • Users' last activity is now tracked on login rather than on every request. Updating the last activity on every request caused database problems at scale.
      • Note: Last activity is only relevant to fly active-users.
    • Note: This breaking change only applies to custom automation built around Concourse that authenticates with the Concourse API.
  • Fixed GitLab auth to reference a user's username instead of their full name
    • If a team was configured for GitLab auth with the --gitlab-user flag, you must ensure the user referenced is a valid GitLab username, rather than a user's full name. If not, you must reconfigure the team with fly set-team to reference the username. Teams configured with --gitlab-group are unaffected.
  • Added format: trim to the load_var step and made it the default
    • format: trim removes all trailing and leading whitespace from the input file
    • The prior behavior of keeping all whitespace can be used by specifying format: raw (see load_var.format)
  • Prefix containerd specific flags with CONCOURSE_CONTAINERD
    • This is only a breaking change if you were using the experimental containerd backend. The default guardian backend remains unaffected and is still configured with CONCOURSE_GARDEN prefixed environment variables.
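As a sketch of the load_var change listed above, the plan below reads the same file twice: once with format: raw to keep the previous whitespace-preserving behavior, and once with no format so the new trim default applies. The resource name repo, the file path, the var names, and the busybox image are hypothetical placeholders, and the resource definition itself is omitted.

```yaml
# Hypothetical pipeline fragment; `repo` and `repo/version` are placeholders.
jobs:
- name: read-version
  plan:
  - get: repo
  # Keep leading and trailing whitespace, as load_var did before this release.
  - load_var: version-raw
    file: repo/version
    format: raw
  # No format given: the new default (trim) strips surrounding whitespace.
  - load_var: version
    file: repo/version
  - task: print-version
    config:
      platform: linux
      image_resource:
        type: registry-image
        source: {repository: busybox}
      run:
        path: echo
        args: ["((.:version))"]
```

Pipelines that depend on a trailing newline in the loaded value are the ones most likely to need the explicit format: raw.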

Features

New features and changes in this release:

Fly Commands

These are the new features and changes to fly commands:

  • Emit warnings for invalid identifiers
    • A warning will be emitted for every identifier that doesn't match the validation rules described in the identifier schema
    • In a future version there will be an error when identifiers don't match the validation rules - so, we recommend adhering to the new validation as soon as possible!
  • Support set-pipeline: self for configuring current pipeline
    • This feature is experimental, and may be removed in a future version
  • Added a --include-archived flag for the fly pipelines command in order to include archived pipelines in the list of returned pipelines.
  • Added a way of renaming pipeline resources while preserving version history by updating the resource name (as well as any references in steps) and specifying its old name as old_name within the resource configuration. After the pipeline has been configured, the old_name field can be removed.
  • Added a --team flag to the fly set-pipeline command in order to allow for setting a pipeline under a different team than the one the user is currently logged into.
  • Added a --team flag to the pause-pipeline, hide-pipeline, and destroy-pipeline commands for the same reason.
  • The --config flag of the fly set-pipeline command now supports - for reading pipeline config from stdin.
  • A warning is printed after fly set-pipeline if the pipeline is paused.
  • Added an --ignore-event-parsing-errors flag to fly watch to ignore event parsing errors when an unknown event type or version is encountered.
  • Allow the fly http transport to use client certificates.
    • Adds new --client-cert and --client-key flags to the fly login command. The provided client certificates will then be used by fly's http transport.
  • Pipelines are now validated to ensure that they contain at least one job. Pipeline configurations with no jobs will be rejected.
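The resource rename and set_pipeline: self items above can be sketched together in one small pipeline. This is an illustration only: the resource names, the URI, and the file path are hypothetical placeholders.

```yaml
# Hypothetical pipeline; names, URI, and paths are placeholders.
resources:
- name: app-source        # the new resource name
  old_name: source        # the previous name, so version history is preserved
  type: git
  source:
    uri: https://example.com/org/app.git

jobs:
- name: reconfigure       # a pipeline needs at least one job to pass validation
  plan:
  - get: app-source       # references in steps use the new name
    trigger: true
  - set_pipeline: self    # experimental: configures the pipeline running this build
    file: app-source/ci/pipeline.yml
```

Once the pipeline has been set with this configuration, the old_name line can be deleted.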

Resources

These are the new features and changes to Concourse resources:

  • Registry-Image Resource and Docker-Image Resource:
    • Both now use HEAD requests rather than GET during the check operation so that checks don't count towards the DockerHub rate limit. Registries that do not support HEAD requests fall back to GET requests.
    • This is part of the mitigations for DockerHub rate limiting that began November 2, 2020.

Core Functionality

These are the new features and changes to the core functionality of Concourse:

  • Automatically archive abandoned pipelines
    • An abandoned pipeline is one that was once set by a set_pipeline step but no longer is. See the set_pipeline docs for more information
    • The temporary feature flag --enable-archive-pipeline was removed as the archiving pipelines feature is complete
  • Added experimental across step for running build plans across a matrix of values
    • This feature must be enabled using --enable-across-step as it is subject to change - don't be alarmed if your pipelines with across stop working in a future release!
    • There's no official documentation yet, but there are some examples of usage patterns in the RFC.
  • Allow dot and colon in variable path
    • You can now interpolate variables with special characters . and : in the name by wrapping them in double quotes
    • Example: (("some.secret".field1)) accesses field1 of the secret some.secret
  • Rerun builds with baggageclaim network errors
    • The --enable-rerun-when-worker-disappears flag now supports rerunning builds after any network error from the ATC to the worker's baggageclaim. Such network errors are common when the worker disappears.
    • Builds will now be rerun when this flag is enabled and the failing step is a nested step (e.g. within an in_parallel)
  • OPA integration: Concourse can now integrate with OPA to enforce policies. RFC #41 for the integration is still open, so the feature is still in progress.

    Experimental

    This is an experimental feature and not recommended for production use at this point.

  • Allow rotating the encryption key via concourse migrate:

    • The concourse migrate command can be called with --old-encryption-key to rotate the database encryption key as a one-time operation
    • The concourse web command still accepts --old-encryption-key

    Stop ATCs before running the command

    You should stop any ATCs prior to running the concourse migrate command.

  • Support SAML 2.0 as an auth backend.

  • Stricter validation:
    • Refactored the existing step structure to simplify introducing new steps. The primary user-facing changes are stricter validation and slightly different step validation messages. Previously, fields that were not part of a step did not fail validation; now they do. This affects stateful actions such as fly set-pipeline.
  • Enable secret caching for var_sources.
  • Enhance TasksWaiting metric to include teamId, workerTags, and platform labels
  • TSA's garden client timeout can be configured using --tsa-garden-request-timeout
  • Updated the set_pipeline step so that it can be used across teams via the team: field within the step.

    • This can only be used by pipelines set by the main team (or any team that has admin privileges).

    Experimental

    This is an experimental feature and not recommended for production use at this point.

  • Remove unnecessary updates to the resource's check_error value

  • Added mitigations for DockerHub rate limiting. On November 2, 2020, DockerHub severely limited the number of requests free and anonymous users can send to the DockerHub API. This impacted Concourse because resource check and get operations generated many requests. Mitigations include switching the resources to use non-rate-limited API calls and allowing defaults to be configured for resource types.
  • Allow configuring default source for resource types
    • At the cluster level, an admin can configure default source configurations for the base resource types that come bundled with concourse (e.g. git, registry-image, docker-image, github-release, etc...). This is done by specifying the --base-resource-type-defaults to a file that contains a map of resource-type names to their defaults. The following would configure all registry-image resources in the cluster to use a registry mirror:
      registry-image:
          registry_mirror:
              host: internal.docker.mirror.com
      
    • At the pipeline level, a user can configure defaults for any resource-type in the pipeline by adding a defaults: field. The following would configure any gitlab resource used in the pipeline to have a default username and password:
      resource_types:
      - name: gitlab
        type: registry-image
        source:
            repository: ...
        defaults:
            username: ...
            password: ...
      
    • The defaults can be overridden at the individual step level.
    • This is part of the mitigations for DockerHub rate limiting that began November 2, 2020.
  • Optionally skip resource checking for put-only resources.
    • Resource checking for put-only resources can be turned off by using the feature flag CONCOURSE_ENABLE_SKIP_CHECKING_NOT_IN_USE_RESOURCES.
    • This behaviour was the default in 6.3.0 but has been made optional because it has two side effects: 1) put-only resources will no longer show version history in the UI, and 2) custom resource types of put-only resources will no longer be automatically checked.
    • The performance gains can still be significant for some operators, so use this if you are okay with the side effects.
  • An index, successful_build_outputs_rerun_of_idx, has been added to speed up pipeline and build deletion.

Runtime

These are the new features and changes to the Concourse runtime:

  • Concourse now supports reloading TSA worker keys by sending a SIGHUP signal to the web process. This allows new workers and their keys to be added without having to restart the atc.
  • Added the ability to mount Btrfs loopback with discard option. This punches holes in the underlying loop file making it sparse, and will potentially result in better disk utilization.
  • A maximum container limit for containerd backend can now be set with CONCOURSE_CONTAINERD_MAX_CONTAINERS
  • Added support for task.run.user field in containerd runtime.
  • Containerd now uses the configured resolv.conf
    • The resolv.conf is injected into the containers created by containerd on a worker.
    • When setting the configuration, priority is given to the DNS servers that the user provides through the concourse_garden_dns_servers property. If the user doesn't set this property, the worker host's DNS configuration is used inside the container.
  • Added new default values to the guardian flags network-pool and max-containers.
    • If --network-pool is unset, guardian defaults to limiting a guardian worker to 250 containers. This is true even if max-containers is set to a higher value. The network-pool default is set to 10.80.0.0/16 to increase the allowable addresses significantly, ensuring the container count is only limited by the explicit max-containers flag.
  • Allow statx in containerd
    • Updated containerd's seccomp profile to allow the statx system call. This allows basic commands like ls -l to be executed.

Web UI

These are the new features and changes to the Concourse web UI:

  • Favouriting pipelines: This feature adds the ability to select frequently viewed/used pipelines and have them prioritized within the dashboard and the sidebar.
    • Once a pipeline is selected as a favorite, a new section at the very top of the dashboard displays the tiles of favorite pipelines.
    • Users will have the ability to both favorite and un-favorite pipelines within the sidebar, in the dashboard, and on the pipeline view.
  • Archived pipelines can be displayed in the web UI via a toggle switch in the top bar.
  • Archived pipelines in the sidebar will show an updated icon to help better distinguish archived pipelines.
  • Speed up querying for unencrypted builds
    • If your environment had a large number of builds and an encrypted database, you might have noticed your web node being slow to start up. An index was added to help speed up the querying of unencrypted builds, which is one of the queries that run during the web startup.
  • Custom background image
    • Pipeline authors can now include a custom background_image under the display key in pipeline config. E.g:
        display:
          background_image: https://avatars1.githubusercontent.com/u/7809479?s=400&v=4
      
  • The build log page and fly watch now display the worker name for get, put and task steps.
  • Auth token now shows on the login success page allowing the user to manually copy the token from the browser.
  • set_pipeline step header will indicate whether changes were applied.
    • If a set_pipeline step made any changes, the step header is highlighted in yellow and shows a "pipeline config changed" message on hover.

Resolved Issues

Resolved Fly Command Issues

These are the fixes to fly commands:

  • fly login now accepts arbitrarily long tokens when pasting the token manually into the console. Previously, the limit was OS dependent (with OSX having a relatively small maximum length of 1024 characters). This has been a long-standing issue, but it became most noticeable after 6.1.0 which significantly increased the size of tokens. Note that the pasted token is now hidden in the console output.
  • fly execute now works with images from custom resource types.
  • An error is now shown when a step only contains modifiers.
  • fly get-team --json now returns properly formatted JSON and works as intended.
  • Admins now see an error when logging in with an invalid or non-existent team name
  • fly validate-pipeline now accepts the --enable-across-step flag
  • Fixed a bug where running fly execute would fail in an environment in which all workers are tagged.
  • set_pipeline of a YML pipeline configuration file with no jobs: or resources: no longer causes a runtime error: invalid memory address or nil pointer dereference.
  • Fixed a bug where if a pipeline was set with the --team option, the subsequent unpause-pipeline command that was printed would not have the --team option

Resolved Core Functionality Issues

These are the fixes to the core functionality of Concourse:

  • Fixed a regression where builds could be stuck pending forever if an input was pinned by partially specifying a version via the version field on a get step, version field on the resource configuration or by running fly pin-resource.
  • Fixed a regression that prevented using both static vars and dynamic vars simultaneously in a task.
  • Pipelines can be re-ordered in the dashboard when filtering. This was a regression introduced in 6.0.0.
  • Fixed a validation issue where a step could be set with 0 attempts, causing the ATC to panic.
  • A potential database deadlock after pipeline deletion is now prevented.
  • Made claims LRU cache safe for concurrent use. This avoids a potential race in access token caching, which would cause the ATC to panic.
  • Fixed a bug where a build couldn't be cancelled if it was in the pending state because of unsatisfiable inputs.
  • Builds that get stuck in the pending state permanently because of missing input versions will now automatically abort.
  • Save job.disable_manual_trigger to the database
    • Fixed the disable_manual_trigger: field on jobs -- since v6.0.0 it had no effect and jobs with this setting could actually still be manually triggered.
  • checks-enqueued in metrics is only incremented when a check is created.
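For reference, the disable_manual_trigger fix above concerns job configuration along the following lines. This is a minimal sketch; the job and resource names are hypothetical, and the resource definition is omitted.

```yaml
# Hypothetical job; `release-candidate` is a placeholder resource.
jobs:
- name: deploy
  disable_manual_trigger: true   # with the fix, manual triggering is refused again
  plan:
  - get: release-candidate
    trigger: true                # the job still runs when a new version arrives
```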

Resolved Web UI Issues

These are the fixes to the web UI of Concourse:

  • The sidebar can now be expanded horizontally in the web UI in order to allow for flexibility in resizing. This helps when viewing long pipeline names in the sidebar.
  • Fixed pagination bugs on the resource version and job builds pages.
  • Fixed horizontal scrolling on the build page.
  • Fixed an authorization bug where users might randomly hit the "forbidden" error when they have multiple roles in a team.

Resolved Runtime Issues

These are the fixes to the Concourse runtime:

  • Bump baggageclaim to v1.8.0 to fix deeply-nested volumes with overlay driver. Note that this currently works for a nesting depth of four.
  • Concourse now only passes garden the configured defaults within Concourse for the guardian flags max-containers and network-pool if they are not set through the garden config file, environment variables, or flags.
  • guardian assets are removed on worker startup to ensure guardian dependencies are updated properly.
  • When using the containerd backend inside a runc-managed process (e.g. running Concourse inside Docker) with runc v1.0.0-rc91 or above, creating privileged containers no longer fails with EPERM.

v6.3.1

Release Date: August 4, 2020

Security Fixes

Team Configuration

Any Concourse teams configured with GitLab users may need to be updated. Previously, a GitLab user's Full Name was used to add them to a Concourse team; now the user's Username in GitLab is used by Concourse to verify team membership. If the Full Name and Username are the same, no change is necessary.

This release contains the following security fixes:

  • Fixes the GitLab auth connector not using the correct name:
  • Critical CVE-2020-5415:
    • A GitLab user could impersonate another user when logging into Concourse with GitLab, giving them that user's access to Concourse.
    • Concourse teams configured through GitLab groups are not susceptible to this CVE.

v6.3.0

Release Date: June 11, 2020

Upgrading

VMware Concourse provides upgrade guides describing the step-by-step process for upgrading both a BOSH and Helm-deployed Concourse.

Warning

Please expect and prepare for some downtime when upgrading to v6.3.0. On our large scale deployments we have observed 10-20 minutes of downtime as the database is migrated, but this time will vary depending on the size of your database.

BOSH

If you are currently using the VMware Concourse v5.2.7 BOSH release or below, VMware recommends first upgrading to v5.5.11 before upgrading to v6.3.0 as the most reliable upgrade path. Choose from the following guides based on the version you want to upgrade from in order to upgrade to version v5.5.x of Concourse:

Once you're on v5.5.x, you can follow our v6 upgrade guide:

Helm

If you are currently using the Helm release, you should already be on 5.5.x.


Breaking Changes

This release has the following breaking changes:

Fly Commands

These breaking changes affect the fly commands:

  • The query argument of the fly curl command has been removed. In the past, when passing curl options as fly curl <url_path> -- <curl_options>, the first curl option was incorrectly parsed as the query argument, which caused unexpected curl behaviour. With this fix, <curl_options> functions as documented and the way to add query params to fly curl is more intuitive: fly curl <url_path?query_params> -- <curl_options>.

Core Functionality

These breaking changes affect the core functionality of Concourse:

  • A new algorithm for determining inputs for jobs has been implemented: This new algorithm significantly reduces resource utilization on the web and db nodes, especially for long-lived and/or large-scale Concourse installations.

    • The old algorithm loaded all the resource versions, build inputs, and build outputs into memory and then used brute force to figure out what the next inputs would be. While this method worked well enough in most cases, in scenarios where users had long-lived deployments with thousands (or even millions) of versions or builds, loading the data set would put a lot of strain on the web and db nodes.
    • The new algorithm takes a very different approach which does not require the entire dataset to be held in memory and cuts out nearly all of the "brute force" aspect of the old algorithm. The new algorithm makes use of Postgres' jsonb index functionality; a successful build's set of resource versions are stored in a table which Concourse can easily "intersect" in order to find matching candidates when evaluating passed constraints.
    • As an addition to this feature, if the algorithm fails to find a satisfactory set of inputs, the reason will now be shown for each input in the build preparation. This should make it easier to troubleshoot why a build is in a "pending" state.
    • Breaking Change: For inputs with passed constraints, the algorithm now chooses versions based on the build history of each job in the passed constraint, rather than version history of the input's resource. There should be little difference in behavior from a user's standpoint but the result of using build history rather than version history in the new algorithm might show in several edge cases. These cases typically involve an input with version: every and passed constraints that is jumping around versions due to pinning or disabling.

      Migrating Existing Data

      Given the huge changes to how the scheduler uses data, you may be wondering how the upgrade's data migration works. Instead of doing it all at once in a migration at startup, the algorithm will migrate data for builds as it needs to. Overall, this should result in very little work to do as most jobs will have a satisfiable set of inputs without having to go too far back in the history of upstream jobs.

  • Operators can now limit the number of concurrent API requests that their web node will serve by passing a flag like --concurrent-request-limit action:limit where action is the API action name as it appears in the action matrix in our docs.

    • If the web node is already concurrently serving the maximum number of requests allowed by the specified limit, any additional concurrent requests will be rejected with a 503 Service Unavailable status. If the limit is set to 0, the endpoint is effectively disabled, and all requests will be rejected with a 501 Not Implemented status.

      • Currently the only API action that can be limited in this way is ListAllJobs. If the ListAllJobs endpoint is disabled completely (with a concurrent request limit of 0), the dashboard reflects this by showing empty pipeline cards labeled 'no data'.

      If you use this configuration, it is possible for super-admins to effectively deny service to non-super-admins.

      This is because when super-admins look at the dashboard, the API returns a huge amount of data (much more than the average user) and it can take a long time (over 30s on some larger clusters) to serve the request.

      If you have multiple super-admin dashboards open, they will constantly consume some portion of the number of concurrent requests your web node will allow. Any other requests, even if they are potentially cheaper for the API to service, are much more likely to be rejected because the server is overloaded by super-admins.

      Still, the web node will no longer crash in these scenarios, and non-super-admins will still see their dashboards, albeit without the pipeline preview cards.

      To work around this scenario, it is important to be careful of the number of super-admin users with open dashboards.

    • Breaking Change: The above-mentioned --concurrent-request-limit flag replaces the --disable-list-all-jobs flag introduced in v5.2.8 and v5.5.9. To get consistent functionality, change --disable-list-all-jobs to --concurrent-request-limit ListAllJobs:0 in your configuration.

  • A new login flow for Concourse has been implemented: In the old login flow, Concourse took all the upstream third-party info (e.g. GitHub username, teams, etc.), figured out which teams you're on, and encoded those into your auth token. The problem with this approach is that every time you change your team config, you need to log out and log back in.

    We have revised this flow to instead use a token directly from dex, the out-of-the-box identity provider that ships with Concourse.

    • This new flow introduces a few additional db calls on each request, but mitigations (caching and batching) have been added to reduce the impact.

      Log out and Log In

      You will need to log out and log back in after upgrading. Make sure to sync fly using fly sync -c <concourse-url> before logging in.

  • LIDAR is the new default resource checking component, replacing the old component, Radar.

    With this switch, the metrics pertaining to resource checking have also changed. Please consult the now-updated Metrics documentation and update your dashboards accordingly.

    See the Features - Runtime section of this doc for more info on LIDAR.

  • The flags library has been updated with stricter validation for flags passed via environment variables: It has long been possible to configure Concourse either by passing flags to the binary, or by passing their equivalent CONCOURSE_* environment variables. However, when an environment variable is passed, the flags library Concourse used would treat it as a "default" value -- which is a bug. This fix updates that library to prevent this from happening.

    Essentially, in cases where operators pass invalid configuration via environment variables, Concourse will now complain whereas it didn't before - after this upgrade, that invalid configuration will cause the binary to fail.

  • When looking up credentials, pipeline scoped credential managers are now preferred over global ones.

  • The default for compressing artifacts has been switched back to gzip from zstd. This change is configurable so zstd can continue to be utilized if desired.
  • The flag for configuring the interval at which Concourse runs its internal components has been updated: CONCOURSE_RUNNER_INTERVAL is now CONCOURSE_COMPONENT_RUNNER_INTERVAL.
  • Support has been removed for emitting metrics to Riemann. This decision is part of a strategy to move towards standardizing on OpenTelemetry.
  • All API payloads are now gzipped. This change should both save bandwidth and make the web UI faster.

Web UI

This breaking change affects the Concourse web UI:

  • Updated the Material Design icon library to 5.0.45.
    • Note: Some icons changed names (e.g. mdi-github-circle was changed to mdi-github) so after this update you might have to update some icon: references.
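As an illustration of the icon rename noted above, a resource that referenced the old icon would be updated roughly as follows. This sketch assumes the icon is set on a resource and written without the mdi- prefix; the resource name and URI are hypothetical.

```yaml
# Hypothetical resource; only the icon: value is the point of the example.
resources:
- name: repo
  type: git
  icon: github            # previously: github-circle
  source:
    uri: https://example.com/org/repo.git
```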

Features

New features and changes in this release:

Fly Commands

These are the new features and changes to fly commands:

  • fly has a new sub-command pin-resource, which will pin a version of a resource when given at least one field of the version to pin to.
  • Added a --team flag to fly commands so that you can run them against different teams that you are authorized to perform actions against, without having to log in to the team with a separate fly target.

    • So far, this flag has been added to:
      • intercept
      • trigger-job
      • pause-job
      • unpause-job
      • jobs
  • Added the ability to enable or disable a resource version from the command line using the new fly commands fly enable-resource-version and fly disable-resource-version.

  • Added a --url flag to fly watch, allowing the ability to copy the URL of a build from your browser and paste it in your terminal to keep watching the build.
  • Added an --all option to the order-pipelines command which will sort pipelines alphabetically.
  • Added the --all flag to the fly pause-pipeline and fly unpause-pipeline commands. This allows users to pause or unpause every pipeline on a team at the same time.
  • An age column has been added to fly workers.
  • Added a last updated column to the output of fly pipelines showing the date on which the pipeline was last set or reset.

Resources

These are the new features and changes to Concourse resources:

  • Registry-Image Resource:
    • A content_trust: field has been added to the registry-image-resource. This allows you to sign your container images with a notary server.

Core Functionality

These are the new features and changes to the core functionality of Concourse:

  • Build re-running has been implemented: This feature allows a new build to run using the exact same set of input versions as the original build. A re-run creates a new build named after the original build, with the re-run number appended (e.g. 3.1 for the first re-run of build 3). There are two ways to re-run a build:
    • Through the web UI on the builds page
    • Through fly rerun-build
  • Introduced a components table in order to better synchronize all the internal processes that run on the web nodes: This enhancement should help reduce the amount of duplicated work when running more than 1 ATC, as well as decrease db load. There is no configuration required to take advantage of these new improvements.
  • API endpoints have been changed to use a single transaction per request, so that they become "all or nothing" instead of holding data in memory while waiting for another connection from the pool. In the past, this could lead to snowballing and increased memory usage as requests from the web UI (polling every 5 seconds) piled up.
  • Introduced a new set_pipeline step: The set_pipeline step allows a build to configure a pipeline within the build's team. The set_pipeline step can be used as a simpler alternative to the concourse-pipeline resource, with the key difference being that the set_pipeline step does not need any auth config.

    • The set_pipeline step supports vars within its plan configuration (the file:, vars:, and var_files: fields).

    Experimental

    This is an experimental feature and not recommended for production use at this point.

  • Introduced a new load_var step: The load_var step can be used to load a value from a file at runtime and set it in a local var source so that later steps in the build may pass the value to fields like params. With this primitive, resource type authors will no longer have to implement two ways to parameterize themselves (i.e. tag and tag_file). Resource types can now implement simpler interfaces which expect values to be set directly, and Concourse can handle the busywork of reading the value from a file.

    Experimental

    This is an experimental feature and not recommended for production use at this point.

  • Added support for var_sources in the pipeline config: This feature allows Concourse to fetch secrets from multiple independent credential managers per pipeline.

    Experimental

    This is an experimental feature and not recommended for production use at this point.

  • Added the ability to tune the mapping between API actions and roles via the --config-rbac flag: While you cannot yet create your own roles, you can customize the built-in ones by promoting and demoting the roles to which certain API actions are assigned.

  • Credentials fetched from a credential manager will now be automatically redacted from build output.

    Opt In

    This feature has to be opted into. For instructions on how to opt-in, see the following docs.

  • Concourse team roles can now be assigned to different CF space roles independently.

    Example

    You can now create role mappings such as "auditors in my CF space should be viewers in my Concourse team", whereas before you could only assign Concourse roles to CF developers.

  • Implemented an optimization which should lower the resource checking load on some instances: Instead of checking all resources, only resources which are actually used as inputs will be checked.

  • Implemented a way for put steps to automatically determine the artifacts they need by configuring inputs: detect: With detect, the step will walk over its params and look for paths that correspond to artifact names in the build plan (e.g. tag: foo/bar or repository: foo). When it comes time to run, only those named artifacts will be given to the step, which can save a lot of time otherwise spent transferring artifacts the step does not need.
  • Added experimental support for exposing traces to Jaeger or Stackdriver: With this feature enabled (via --tracing-(jaeger|stackdriver)-* variables in concourse web), the web node starts recording traces that represent the various steps that a build goes through, sending them to the configured trace collector.
  • Added tracing to the LIDAR component: A single trace will be emitted for each run of the scanner and the consequential checking that happens from the checker. The traces allow for more in-depth monitoring of resource checking by describing how long each resource takes to scan and check.
  • When distributed tracing is configured, Concourse will now emit spans for several of its backend operations, including: resource scanning, check execution, and job scheduling. These spans will be appropriately linked when viewed in a tracing tool like Jaeger, allowing operators to better observe the events that occur between resource checking and build execution.
  • When distributed tracing is configured, all check, get, put, and task containers will be run with the TRACEPARENT environment variable set, which contains information about the parent span following the w3c trace context format: TRACEPARENT=version-trace_id-parent_id-trace_flags. Using this information, a user's tasks and custom resource_types can emit spans to a tracing backend, and these spans will be appropriately linked to the step in which they ran. This can be particularly useful when integrating with downstream services that also support tracing.
  • Added tracing to allow users and developers to observe volume streaming from source to destination volumes.
  • Added spans for the load_var and set_pipeline steps when distributed tracing is enabled.
  • The core functionality for archiving pipelines has been implemented: Archiving a pipeline will soft-delete a pipeline while preserving its data for potential use at a later date. Similar to paused pipelines, no resource checking or job scheduling is performed for archived pipelines. Build logs are kept, but remain subject to the configured build log retention policy. Further considerations:

    • Unlike paused pipelines, archived pipelines will have their configuration stripped out so that sensitive information is not stored forever.
    • Unlike paused pipelines, new builds cannot be created for archived pipelines.
    • Archived pipeline names exist in the same namespace as unarchived pipelines. Configuring a new pipeline with the same name as an archived pipeline un-archives the pipeline and gives it a new configuration.

      Note

      Archived pipelines are visible neither in the web UI nor in fly pipelines.

      Note

      Archiving a pipeline will nullify the pipeline configuration. If you downgrade the version of Concourse, unpausing a pipeline that was previously archived will result in a broken pipeline. To fix that, set the pipeline again.

  • When the scheduler tries to start a build with a version that does not exist, it will print an error message to the pending preparation build page. This should help give visibility into why a build is stuck pending.

  • When the scheduler starts a build, it will send a notification to the build tracker to run it. Without this notification from the scheduler to the build tracker, it can take up to approximately 10 seconds before your build gets run after being in a started state.
  • Conjur has been added as a supported credential manager.
  • Added support for Microsoft login via dex.
  • Added support for Vault namespaces which should make managing secrets easier.
  • The cluster name can now be added to each and every log line with the --log-cluster-name flag, available on the web nodes. This flag can be used in a scenario where you have multiple Concourse clusters forwarding logs to a common sink and have no other way of categorizing the logs.
  • Enabled some useful new metrics to be emitted when LIDAR is enabled. These include:
    • The size of the check queue
    • The number of checks queued per ATC each tick
    • The number of checks garbage-collected at a time
    • Checks started
    • Checks finished
  • Added a global configuration to override the check interval for any resources that have been configured with a webhook token.
  • Prometheus and NewRelic can now receive LIDAR check-finished events.
  • Added proxy support for NewRelic emitter.
  • The job label for build_duration metrics is now exported to Prometheus.
  • Added a metric for the number of tasks that are currently waiting to be scheduled when using the limit-active-tasks placement strategy.
  • Concourse will now bcrypt client secrets in the db.
  • Added the ability to pin a resource to a different version without unpinning it first.
  • Path templating for secret lookups in Vault credential manager is now supported. Previously, pipeline and team secrets would always be searched for under "/prefix/TEAM/PIPELINE/" or "/prefix/TEAM/", where users could customize the prefix but nothing else. Now users can supply their own templates if their secret collections are organized differently, including for use in var_sources.
  • Added a feature to stop the ATC from attempting to renew Vault leases that are not renewable.
  • Added support for a ?title= query parameter on the pipeline/job badge endpoints.
  • Added minimum_succeeded_builds to the build log retention on the job config. This feature will ensure the build reaper keeps around logs for N successful builds.
  • Garden client HTTP timeout is now configurable.
  • The NewRelic emitter now batches emissions and logs information for non-2xx responses from NewRelic.
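
The TRACEPARENT format mentioned in the tracing notes above (version-trace_id-parent_id-trace_flags) can be split into its fields by a task before it emits its own spans. The following is a minimal Python sketch, not Concourse code: the helper name parse_traceparent and the TraceParent type are hypothetical, and the field widths are taken from the W3C trace context format rather than from these notes.

```python
import os
import re
from typing import NamedTuple, Optional

# Field widths follow the W3C trace context format:
# version (2 hex), trace_id (32 hex), parent_id (16 hex), trace_flags (2 hex).
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    """Hypothetical container for the four TRACEPARENT fields."""
    version: str
    trace_id: str
    parent_id: str
    trace_flags: str


def parse_traceparent(value: Optional[str]) -> Optional[TraceParent]:
    """Return the four fields, or None if the value is absent or malformed."""
    if not value:
        return None
    match = _TRACEPARENT_RE.match(value.strip())
    if match is None:
        return None
    return TraceParent(*match.groups())


if __name__ == "__main__":
    # Inside a Concourse container the variable is only set when tracing
    # is configured; anywhere else this prints None.
    print(parse_traceparent(os.environ.get("TRACEPARENT")))
```

A task could pass trace_id and parent_id to whichever tracing client it uses so that its spans are linked to the step in which it ran.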

Runtime

These are the new features and changes to the Concourse runtime:

  • Introduced a new method of resource checking, LIDAR: The entire system has been redesigned to be asynchronous. However, this change should not have any effect on existing workflows. Outside of a small change to the command output, fly check-resource and fly check-resource-type will continue to work as expected. In addition, a --async flag can now be specified if you do not want to wait for the check to finish.

    • Added a rate limit to resource checking in order to help spread out the rate of checks: By default, this rate limit is calculated as the number of checkables (resources and resource types) that need to be checked per second in order to check everything within the default checking interval. For example, if there are 600 checkables and the default checking interval is 60 seconds, Concourse would need to run 10 checks per second in order to check everything in 60 seconds. The rate limit can be set to a static number through the max-checks-per-second flag, or turned off by setting it to -1.

    DB Scaling and Garbage Collection

    Concourse performs a large number of checks. As checks are now being stored in the db, this table will probably grow quite quickly. By default checks get garbage-collected every 1 minute, but this interval can be configured by specifying a CONCOURSE_GC_CHECK_RECYCLE_PERIOD.

    To reduce the number of checks that happen, you can utilize the webhook endpoint to trigger checks from external sources. This allows you to significantly lengthen the check_every interval (which defaults to 1m) for your resource without impacting the time it takes to schedule a build.

  • Concourse now garbage collects worker containers and volumes that are not tracked in db: In some niche cases, it was possible for containers and/or volumes to be created on the worker, but the db (via the web) assumed their creation had failed. If this occurred, these untracked containers could pile up on the worker and use resources. This new feature ensures that they get cleaned appropriately.

  • This release introduces the first iteration of the containerd backend. This is an experimental backend and not yet recommended for customer use.

    • Added support to the experimental containerd worker backend to leverage the worker's DNS proxy to allow name resolution even in cases where the worker's set of nameservers is not reachable from the container's network namespace (for instance, when deploying Concourse workers in Docker, where the worker nameserver points to 127.0.0.11, an address that an inner container wouldn't be able to reach without the worker proxy).

    Experimental

    This is an experimental feature and not recommended for production use at this point.

  • Updated the way that hijacked containers get garbage collected: Concourse no longer relies on garden to clean up hijacked containers. Instead, this functionality has been implemented in Concourse itself, making it much more portable to different container backends.

  • Updated the way that containers associated with failed runs get garbage collected: Containers associated with failed runs used to sit around until a new run was executed. They now have a maximum lifetime (120 hours by default), configurable via the failed-grace-period flag.
  • Added a web runtime flag CONCOURSE_SECRET_CACHE_DURATION_NOTFOUND to set a separate caching interval when a secret is not successfully found in the config store. The interval is set to 10s by default.
  • Added a 5-minute timeout for baggageclaim destroy calls.
  • Added a 5-minute timeout for the worker's Garden client HTTP calls.
  • When the web node is instructing a worker to create a container, any logs emitted will now mention that worker's name.
  • Updated worker heartbeat log level from debug to info in order to reduce extraneous log output for operators.
  • Added a new flag (CONCOURSE_CONTAINER_NETWORK_POOL) to let you configure the network range used for allocating IPs for the containers created by Concourse. This is primarily intended to support the experimental containerd worker backend. Despite the introduction of this new flag, CONCOURSE_GARDEN_NETWORK_POOL is still functional for the (stable and default) Garden worker backend.
  • Added support for the configuration of the set of DNS resolvers to be made visible (through /etc/resolv.conf) to containers that Concourse creates when leveraging the experimental containerd worker backend.
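
The default check rate limit described for LIDAR above is simple arithmetic: checkables divided by the checking interval. The sketch below reproduces the 600-checkables example in Python. It is illustrative only: the function names are hypothetical, None stands in for "flag not set" as an assumption of this sketch, and no rounding is applied because these notes do not say how Concourse rounds.

```python
from typing import Optional


def default_checks_per_second(checkables: int, interval_seconds: float) -> float:
    """Checks per second needed to cover every checkable once per interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return checkables / interval_seconds


def effective_rate_limit(
    checkables: int,
    interval_seconds: float,
    max_checks_per_second: Optional[float] = None,
) -> Optional[float]:
    """Resolve the rate limit; a return value of None means 'no limit'.

    max_checks_per_second models the flag from the notes:
      None -> flag not set, use the calculated default (sketch assumption)
      -1   -> rate limiting turned off
      else -> static limit
    """
    if max_checks_per_second is None:
        return default_checks_per_second(checkables, interval_seconds)
    if max_checks_per_second == -1:
        return None
    return float(max_checks_per_second)


if __name__ == "__main__":
    # 600 checkables over a 60 second interval.
    print(effective_rate_limit(600, 60))
```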

Web UI

These are the new features and changes to the Concourse web UI:

  • The dashboard has become more responsive overall through a series of optimizations:
    • Optimized the ListAllJobs endpoint so that it no longer requires decrypting and parsing the configuration for every single job. This dramatically reduces resource utilization for deployments with a ton of pipelines.
    • Concourse now caches the last-fetched data in local browser storage so that navigating to the dashboard renders at least some useful data rather than blocking on all the data being fetched fresh from the backend.
    • Implemented infinite scrolling and lazy rendering, which should greatly improve performance on deployments with hundreds of pipelines configured.
    • Improved the initial page load time by lazy-loading JavaScript that isn't necessary for the first render.
    • The dashboard will no longer "pile on" requests to a slow backend. Previously, if the web node was under too much load, it could take longer to respond to the ListAllJobs endpoint than the default polling interval, and the dashboard could start another request before the last one finished. It will now wait for the previous request to complete before making another.
  • Moved the "pin comment" field in the Resource view to the top of the page (next to the currently pinned version). The comment can be edited inline.
  • The design of the pin menu on the pipeline page now matches the sidebar. In addition, the dropdown now toggles when clicking the pin icon.
  • The build page now shows text labels for different step types, such as get:, task:, and set_pipeline:, instead of the icons from previous versions.
  • The resource metadata displayed in a get step on the build page has been restyled, making it easier to read and follow.
  • The Material Design icon library has been updated so now the concourse-ci icon is available for resources.
  • Enlarged the build prep list font to match the other build log output styling.
  • Added a loading indicator on the dashboard while awaiting initial API/cache responses.

Installation

These are the new features and changes to the Concourse installation:

  • Added CONCOURSE_GARDEN_NETWORK_POOL as a configurable flag in the BOSH release. This defaults to Garden's range of 10.254.0.0/22.
  • Added CONCOURSE_GARDEN_MAX_CONTAINERS as a configurable flag in the BOSH release. This defaults to 250.

    Warning

    Setting this limit over 250 has not been formally tested by the Garden team or the Concourse team.


Resolved Issues

This release fixes the following issues:

Resolved Fly Command Issues

These are the fixes to fly commands:

  • Fixed an issue with fly login where Safari would block tokens from being transferred to fly.
  • The fly set-team documentation when running --help previously suggested that a list is a valid input to any auth configuration flags. This doesn't mean you can supply a comma-separated list to the flag, rather that the flag can be provided multiple times. The fly set-team help documentation now reflects this.
  • Fixed a bug where fly builds would show the wrong duration for cancelled builds.
  • Fixed an issue where fly workers would show the wrong age for a worker if that worker was under an hour old.
  • When specifying a specific version on a get step, fly now validates that only string values (no nested YAML) are allowed.
  • fly format-pipeline now always produces a formatted pipeline, instead of declining to do so when it was already in the expected format.
  • Fixed a regression where running fly sync displayed the following warning message along with an incorrect progress bar: warning: failed to parse Content-Length: strconv.ParseInt: parsing "": invalid syntax
  • Changed the behaviour of fly set-team so that when a role has no groups or users configured, it no longer raises an error.
  • Fixed a bug where fly would no longer tell you if the team you logged in with was invalid.
  • fly sync no longer requires a target to be registered beforehand; instead, a --concourse-url (or -c) flag may be specified.
  • fly validate-pipeline will now work when given a pipeline config which uses var_sources.
  • Improved the error that fly reports when your .flyrc has invalid YAML.
  • Changed the Concourse CLI to output help text on stdout when the -h or --help flag is passed. This makes it easier to use other tools like grep to find relevant parts of the usage text.
  • Fixed an x509 issue that occurred when a super admin logged in without CACert after their first successful login. Now when a super admin logs in without providing the CACert after the first successful login, the CACert will be loaded from .flyrc.

Resolved Resources Issues

These are the fixes to Concourse resources:

  • Semver Resource:
    • Fixed an issue with the git driver. Now the resource will create the file: specified in the source configuration if it doesn't already exist.
    • Fixed an issue with the git driver where it would go into an infinite loop when git push failed.
  • Git Resource:
    • Fixed an issue where the git resource would display too many commits when paths: was specified.
    • Fixed an issue where the version order was incorrect when using paths.
  • Registry-Image Resource:
    • Fixed a bug where get steps would give a 404 error.
    • Bumped the resource to v0.8.2, which should resolve DIGEST_INVALID errors introduced by faulty retry logic. Additionally, the resource will now retry on 429 (Too Many Requests) errors from the registry, with exponential back-off up to 1 hour.
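
To illustrate what "exponential back-off up to 1 hour" can look like for 429 responses, here is a minimal Python sketch of a capped delay schedule. It is not the resource's implementation: the 1 second base delay and the doubling factor are assumptions, and "up to 1 hour" is read here as a cap on each individual delay, which these notes do not confirm.

```python
from itertools import islice
from typing import Iterator, List

# "Up to 1 hour", interpreted in this sketch as a per-delay cap.
MAX_DELAY_SECONDS = 60 * 60


def backoff_delays(
    base_seconds: float = 1.0,   # assumed starting delay
    factor: float = 2.0,         # assumed growth factor
    cap_seconds: float = MAX_DELAY_SECONDS,
) -> Iterator[float]:
    """Yield retry delays that grow exponentially and then stay at the cap."""
    delay = base_seconds
    while True:
        yield min(delay, cap_seconds)
        delay = min(delay * factor, cap_seconds)


def first_delays(count: int) -> List[float]:
    """Convenience helper: the first `count` delays with default settings."""
    return list(islice(backoff_delays(), count))


if __name__ == "__main__":
    print(first_delays(14))
```

With these assumed defaults the delay doubles from 1 second until it would exceed the cap, after which every retry waits the full hour.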

Resolved Core Functionality Issues

These are the fixes to the core functionality of Concourse:

  • The implementation of the new algorithm fixes an edge case where multiple resources with corresponding versions (e.g. a v1.2.3 semver resource and then a binary in S3 corresponding to that version) are correlated by virtue of being passed along the pipeline together. When one of the correlated versions was disabled, the old algorithm would incorrectly continue to use the other versions, matching it with an incorrect version for the resource whose version was disabled. Because the new algorithm always works by selecting entire sets of versions at a time, they will always be correlated, thus eliminating this issue.
  • Fixed an issue where builds could get stuck in pending state for jobs that are set to run serially. Because of the new algorithm, builds should never be stuck in a scheduled state because of its serial configuration.
  • The db will now use a version hash to look up resource caches in order to speed up any queries that reference resource caches.
  • Fixed a bug where missing volumes that had child volumes referencing them caused garbage collection of all missing volumes to fail.
  • Fixed a bug where /opt/resource/out scripts in resources could crash web nodes by outputting null to stdout, causing a nil pointer dereference.
  • Fixed a bug introduced in v5.5.0 where Prometheus metrics would get clogged up with data about workers that were no longer registering.
  • Improved the way auth config for teams is validated. Now operators cannot start a web node with an empty --main-team-config file, and fly set-team will fail if it would result in a team with no possible members. This prevents scenarios where users can get accidentally locked out of Concourse.
  • Corrected an issue where secret redaction incorrectly "redacted" empty strings, resulting in odd-looking logs.
  • Fixed an issue with secret redaction wherein a secret containing e.g. { on its own line (i.e. formatted JSON) would result in { being replaced with ((redacted)) in build logs. Single-character lines will now be skipped.
  • Added support for AWS SSM for var_sources.
  • Because of the removal of 'Radar', the --resource-type-checking-interval flag has been removed.
  • Improved the output returned by Concourse for errors, which should help resource authors better understand the errors being returned.
  • Added some lock types that weren't getting emitted as part of our metrics.

    Note

    Lock metrics may increase because of this change, but it's expected behaviour.

  • Added ability to configure NewRelic insights endpoint which allows the use of EU or US data centers.

  • Fixed a bug with the NewRelic emitter that would display an "invalid memory address or nil pointer dereference" error.
  • Fixed a bug where Vault users that hadn't configured a shared path would end up searching the top level prefix path for secrets.
  • Fixed a bug with the builds api where it would return the wrong builds if you gave it a date newer than the most recent build.
  • Fixed a race condition resulting in a crash with LIDAR.
  • Fixed a bug where only some db queries were logged when --log-db-queries was enabled. Expect to see more log output when using the flag now.
  • Fixed a bug that crashed the web node when renaming a job with old_name equal to name.
  • Concourse used to check for the existence of the legacy migration table by querying information_schema and parsing the plain-text error message "does not exist". It now uses to_regclass (available in Postgres 9.4+), which returns NULL or the table name.
  • Fixed a bug where the concourse_workers_registered metric would never go below 1, even when workers were pruned.
  • Fixed a migration issue from v5.4.0 that affected users who had old, unused resources left over from older versions of Concourse.
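
The two secret redaction fixes listed above (empty strings and single-character lines such as { from formatted JSON) can be pictured with a small Python sketch. This is not Concourse's redaction code: it assumes that multi-line secrets are redacted one line at a time and that surrounding whitespace on each line is ignored, and the function names are hypothetical.

```python
from typing import Iterable, List

REDACTED = "((redacted))"


def redaction_targets(secrets: Iterable[str]) -> List[str]:
    """Collect the strings to redact, one per line of each secret.

    Empty and single-character lines are skipped, so an empty secret or a
    lone '{' from formatted JSON is never replaced in the output.
    """
    targets = set()
    for secret in secrets:
        for line in secret.splitlines():
            line = line.strip()  # sketch assumption: ignore indentation
            if len(line) <= 1:
                continue
            targets.add(line)
    # Longest first, so a short target cannot break up a longer one.
    return sorted(targets, key=lambda t: (-len(t), t))


def redact(output: str, secrets: Iterable[str]) -> str:
    """Replace every redaction target found in the build output."""
    for target in redaction_targets(secrets):
        output = output.replace(target, REDACTED)
    return output


if __name__ == "__main__":
    secret = '{\n  "token": "s3cr3t"\n}'
    print(redact('echo {\n  "token": "s3cr3t"\n}', [secret]))
```

Without the length check, replacing an empty string would insert the marker between every character of the output, and every { in a log would be rewritten.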

Resolved Runtime Issues

These are the fixes to the Concourse runtime:

  • The length of time that the history of a resource check is retained was changed from 6 hours to 1 minute. This change was made because the 6 hour default might cause slowness for large deployments.
  • Previously, the build tracker would unconditionally fire off a goroutine for each in-flight build (which then locks and short-circuits if the build is already tracked). This has been changed so that the build tracker will only fire off a goroutine if one does not already exist.
  • Added a fix to transition failed state containers to destroying which results in them being garbage-collected. This ensures that if web's call to garden to create a container times out, the container is subsequently deleted from garden prior to being deleted from the db. This keeps both the web's and worker's state consistent.
  • Fixed a bug where task caches were getting garbage collected every time a pipeline was set.
  • Previously, if a worker stalled, the ATC would still count down and remove any 'missing' containers. If the worker ever came back it would still have these containers, but we would no longer be tracking them in the db. The ATC will now no longer delete container rows when a worker is in a stalled state.
  • Changed the behaviour of the web to retry individual build steps that fail when a worker disappears.
  • Fixed a bug for jobs that have any type of serial groups set (serial: true, serial_groups, or max_in_flight). Whenever a build for that job would be scheduled and Concourse would check if the job has hit max in flight, it would unnecessarily recreate all the serial groups in the db.
  • Concourse now explicitly whitelists all traffic for Concourse containers in order to allow outbound connections for containers on Windows.
  • A default value of 4 hours has been set for rebalance-interval. Previously, this value was unset. With the new default, the workers will reconnect to a randomly selected TSA (SSH Gateway) every 4 hours.
  • Fixed a bug where a task's image or input volume(s) were redundantly streamed from another worker despite having a local copy. This would only occur if the image or input(s) were provided by a resource definition.
  • Previously, aborting a build could sometimes result in an errored status rather than an aborted status. This happened when step code wrapped the err return value, fooling Concourse's == check. Concourse now uses errors.Is (new in Go 1.13) to check for the error indicating the build has been aborted. Now the build should be correctly given the aborted status even if the step wraps the error.
  • Enhanced task step vars to support interpolation.
  • Corrected the DNS proxy used by workers when running in Docker to compress the response message sent to the client.
  • Fixed an issue where if fail_fast for in_parallel is true, a failing step causes the in_parallel to fall into on_error.
  • Removed superfluous mentions of register-worker from TSA.

Resolved Web UI Issues

These are the fixes to the web UI of Concourse:

  • Fixed a bug where the job page would show an indefinitely loading spinner when there were no builds.
  • Corrected an issue where the tooltip that says 'new version' on a get step on the build page could be hidden underneath the build header.
  • In the case that a user has multiple roles on a team, the pills on the team headers on the dashboard now accurately reflect the logged-in user's most-privileged role on each team.
  • Fixed the jagged edges on the progress bar indicators used by the dashboard.
  • Fixed a bug where log lines on the build page would have all their timestamps off by one.
  • Teams are now sorted alphabetically by team name, making it easier to search for a specific team.
  • Fixed an issue where the build page would scroll to the top when highlighting a specific line. The highlighted line will now stay in place.
  • Tweaked the UI on the resource page: when a version is pinned, rather than cramming the pinned version into the header, the "pin bar" for the version will now replace the "checking successfully" bar, since pinning ultimately prevents a resource from checking.
  • Improved consistency of auto-scrolling to highlighted logs.
  • Fixed rendering pipeline previews on the dashboard on Safari.
  • Fixed log highlighting on the one-off-build page. Previously, highlighting any log lines would cause the page to reload.
  • Fixed scrolling through the build history list on Firefox.
  • Fixed a bug where your dashboard search string would end up with +s instead of spaces when logging in.