11  Propagation Flowcharts

This page offers a high-level overview of the development and execution workflows for both openEO and OGC Application Packages. It helps the reader understand how these workflows compare, focusing on key concepts without delving too much into technical details or edge cases.

11.1 OGC Application Package

For more information on OGC Application Packages, CWL, and related workflows, explore the resources made available by Terradue.

Warning

The following sections describe the flows as they are executed on the platforms provided by Terradue.

11.1.1 Development Flow

Development begins with Alice (AP-UC1), who has experience with container solutions such as Docker and integrates the code into a CWL workflow. Alice could be a scientist or a computer scientist with general technical skills.

The following sequence diagram outlines a typical development flow for OGC Application Packages. The main steps are:

  1. Prepare container images: Each image executes a specific workflow step and contains the necessary dependencies.
  2. Test in Docker: Container images are tested in a Docker environment to ensure they are working.
  3. Create CWL Documents and Workflow: Using CWL tools, a CWL workflow is created, defining the processing steps and metadata.
  4. Test the CWL Workflow: The CWL workflow is tested in a Docker/k8s environment.
  5. Share the Application Package: The final CWL workflow, now validated, is shared by distributing the CWL workflow definition (YAML format).

The result is a CWL workflow definition that includes the workflow and metadata. You can explore an example here. This CWL workflow is considered an OGC Application Package as it complies with the AP best practice.
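
The shared Application Package is a single CWL document. A minimal sketch of its structure, assuming a hypothetical two-part `$graph` (one Workflow plus one containerised CommandLineTool) with hypothetical identifiers, image names, and URLs throughout, could look as follows; CWL accepts JSON as well as YAML, so the document can be built as a plain Python dictionary:

```python
import json

# Illustrative skeleton of a CWL document following the OGC Application
# Package Best Practice: one file whose $graph holds a Workflow plus the
# containerised CommandLineTool(s) it orchestrates. All identifiers,
# image names and URLs below are hypothetical.
application_package = {
    "cwlVersion": "v1.0",
    "$namespaces": {"s": "https://schema.org/"},
    "s:softwareVersion": "1.0.0",           # schema.org metadata for the AP
    "$graph": [
        {
            "class": "Workflow",
            "id": "water-bodies",           # hypothetical workflow id
            "label": "Water bodies detection",
            "inputs": {
                "stac_item": {"type": "string", "doc": "URL of a STAC item"}
            },
            "outputs": {
                "result": {"type": "Directory", "outputSource": "step/result"}
            },
            "steps": {
                "step": {
                    "run": "#detect-water",
                    "in": {"item": "stac_item"},
                    "out": ["result"],
                }
            },
        },
        {
            "class": "CommandLineTool",
            "id": "detect-water",           # hypothetical tool id
            "baseCommand": ["detect-water"],
            "requirements": {
                # each workflow step runs inside its own container image
                "DockerRequirement": {
                    "dockerPull": "registry.example.test/detect-water:1.0"
                }
            },
            "inputs": {
                "item": {"type": "string", "inputBinding": {"position": 1}}
            },
            "outputs": {
                "result": {"type": "Directory", "outputBinding": {"glob": "."}}
            },
        },
    ],
}

# Serialise to JSON; the YAML form distributed in step 5 is equivalent.
package_json = json.dumps(application_package, indent=2)
```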

More detailed execution scenarios are available in the resources referenced above.

sequenceDiagram
    actor Alice
    participant IDE as IDE Service
    participant Local_Env as (local) Docker
    participant GitHub
    participant Container_Registry as Container Registry

    Note over Alice,IDE: Preparing container images 
    Alice->>IDE: Prepare container images
    Alice->>IDE: Test container image
    IDE->>Local_Env: Start container image
    Local_Env->>Local_Env: Execute container
    Local_Env-->>Alice: Return results of execution

    Note over Alice,IDE: Creating CWL Workflow
    Alice->>IDE: Create CWL workflow
    Alice->>IDE: Test CWL workflow
    IDE->>Local_Env: Start CWL workflow
    Local_Env->>Local_Env: Execute CWL workflow
    Local_Env-->>Alice: Results of CWL workflow test
    Alice->>GitHub: Publish CWL on GitHub
    GitHub->>Container_Registry: Push containers to registry

11.1.2 Hosting Flow

While CWL workflows can be executed independently using CWL tooling, they require an environment that supports container execution and orchestration. For large-scale production or specific infrastructure needs, platform support often becomes essential. Providers like Terradue offer hosting solutions to simplify the process by offering CWL workflows as OGC API Processes.

In this flow, either Alice or Bob can initiate the hosting process. Bob may have received the CWL workflow from Alice or discovered it through an online resource (e.g., GitHub). In both cases, Alice or Bob could choose to:

  • Expose the workflow as a service (OGC API Process) for their project team or a broader audience. (AP-UC2)
  • Use the workflow for (temporary) production campaigns. (AP-UC3)

Alternatively, Bob may discover an existing service on platforms like GEP or NoR but may want to set up his own version tailored to specific infrastructure needs (AP-UC4). This could be the case if Bob requires the service for a large-scale production campaign.

Finally, both Alice and Bob might consider onboarding their algorithm onto the ESA Network of Resources. This broadens the reach of their service, allowing a wider audience to use it at a predefined cost (AP-UC5).

The hosting flow is summarized in these steps:

  1. The user requests support from Terradue for AP execution.
  2. Terradue provisions the required infrastructure based on the user’s needs.
  3. A cost model is established with the user, including a contract and Service Level Agreement (SLA).
  4. The AP is deployed within the OGC Process API namespace by either Terradue or the user.
  5. The OGC API Process is registered on ESA NoR, making it accessible to a broader audience. Costs are covered by the cost model linked to this hosting instance.
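
The deployment in step 4 can be sketched as an HTTP request against the draft OGC API - Processes Part 2 (Deploy, Replace, Undeploy) endpoint. This is a hedged sketch: the endpoint and CWL URLs are hypothetical, and the exact encoding and media types depend on the hosting platform and the evolving Part 2 draft:

```python
import json

# Hedged sketch of step 4: deploying the Application Package through the
# draft OGC API - Processes Part 2 (Deploy, Replace, Undeploy) endpoint.
# Endpoint and CWL URLs are hypothetical placeholders.
DEPLOY_URL = "https://platform.example.test/ogc-api/processes"

deploy_request = {
    "executionUnit": {
        "href": "https://example.test/water-bodies.cwl",  # the shared CWL
        "type": "application/cwl",
    }
}

body = json.dumps(deploy_request)
# An HTTP client would POST `body` to DEPLOY_URL with the user's access
# token; on success the platform answers 201 Created and the new process
# shows up in the namespace's /processes listing.
```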

sequenceDiagram
    actor Alice as Alice / Bob
    actor Provider as Terradue
    participant API as OGC Processes API
    participant Infrastructure
    participant NoR

    Alice->>Provider: Request service deployment
    Alice->>Provider: Define specifications for the infrastructure
    Provider->>API: Setup namespace
    Provider->>Infrastructure: Configure resources
    Provider->>Alice: Agree on cost model (contract and SLA)
    alt Deployment by provider
        Alice->>Provider: Send the AP
        Provider->>API: Deploy AP
    else Deployment by user
        Alice->>API: Deploy AP
    end
    Provider->>NoR: Register AP on the NoR

11.1.3 Execution Flow

Once the workflow is onboarded onto ESA NoR, a third user, Carol, may discover it (AP-UC6) and want to execute it. Carol could be part of the project team or simply find the service through an online platform such as NoR or GEP. Alternatively, Carol could have received the OGC API Processes URL from Bob and/or Alice.

Before executing the workflow, Carol needs to ensure sufficient funds are available, potentially utilizing the NoR sponsoring mechanism to secure these resources.

The overall execution flow is illustrated in the following diagram.

sequenceDiagram
    actor Alice as Carol
    participant Provider as Terradue
    participant API as OGC Processes API
    participant Infrastructure
    participant Catalogue as NoR/GEP/...
    participant NoR
    
    Alice->>Catalogue: Discovers the OGC API Process
    Alice->>NoR: Request sponsoring for execution
    Note over Alice,NoR: After completing NoR procedures
    Alice->>Provider: Create account
    Alice->>API: Execute process
    API->>Infrastructure: Execute workflow
    Infrastructure-->>API: Return result
    API-->>Alice: Return result

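The "Execute process" step in the diagram boils down to an OGC API - Processes execute request. A hedged sketch, in which the process id, input name, and endpoint URL are all hypothetical:

```python
import json

# Hedged sketch of Carol's "Execute process" call: an OGC API - Processes
# execute request. The process id, input name and endpoint are hypothetical.
EXECUTE_URL = (
    "https://platform.example.test/ogc-api/processes/water-bodies/execution"
)

execute_request = {
    "inputs": {
        "stac_item": "https://example.test/stac/items/scene-1"
    },
    "response": "document",
}

body = json.dumps(execute_request)
# Carol POSTs `body` to EXECUTE_URL (adding "Prefer: respond-async" for an
# asynchronous job); the platform replies 201 Created with a Location
# header pointing at /jobs/{jobId}, which she polls until the job status
# is "successful" and then fetches /jobs/{jobId}/results.
```
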
11.2 openEO

For more information, check out the openEO documentation.

Warning

The following sections describe the flows as they are executed on the platforms provided by VITO, such as the CDSE.

11.2.1 Development Flow

The creation of an openEO processing graph begins with Alice, a researcher experienced in openEO concepts. She builds her workflow by combining predefined openEO processes with user-defined functions (OPENEO-UC1).

To start, Alice selects an openEO backend, like CDSE, that provides access to the data she intends to process. Once her process graph is complete, she packages it into a user-defined process (UDP) (OPENEO-UC2), enabling it to be executed by other project members or the broader community. As a final step, Alice may onboard the service onto the ESA NoR, establishing a cost model for service execution (OPENEO-UC3).

The development flow for creating and sharing an openEO UDP is outlined below:

  1. Create an Account: Alice creates an account with an openEO API provider, such as CDSE.
  2. Create the Process Graph: She develops a processing graph that represents her workflow.
  3. Test the Process Graph: Alice tests the graph on her selected openEO backend.
  4. Encapsulate as a UDP: The process graph is then encapsulated as a UDP, complete with metadata.
  5. Test the UDP: She tests the UDP on one or more openEO backends to validate functionality.
  6. Share the UDP: Once validated, the UDP is shared as a JSON definition for broader use.
  7. Register the UDP on ESA NoR: Alice defines a cost model, making the service available on the ESA Network of Resources.

An example UDP definition can be found here.
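
A UDP definition is a single JSON document combining metadata, parameters, and the encapsulated process graph. A minimal illustrative sketch (the UDP id, parameter name, and collection choice are hypothetical, not taken from the linked example):

```python
import json

# Illustrative sketch of an openEO user-defined process (UDP) definition:
# metadata, parameters, and the encapsulated process graph in one JSON
# document. The id, parameter name and collection are hypothetical.
udp = {
    "id": "ndvi_composite",                 # hypothetical UDP id
    "summary": "NDVI composite over an area of interest",
    "parameters": [
        {
            "name": "bbox",
            "description": "Spatial extent to process",
            "schema": {"type": "object", "subtype": "bounding-box"},
        }
    ],
    "process_graph": {
        "load": {
            "process_id": "load_collection",
            "arguments": {
                "id": "SENTINEL2_L2A",
                "spatial_extent": {"from_parameter": "bbox"},
                "temporal_extent": ["2024-01-01", "2024-12-31"],
            },
        },
        "ndvi": {
            "process_id": "ndvi",
            "arguments": {"data": {"from_node": "load"}},
            "result": True,
        },
    },
}

udp_json = json.dumps(udp, indent=2)  # the JSON definition shared in step 6
```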

sequenceDiagram
    actor Alice
    participant IDE as IDE Service
    participant openEO as CDSE
    participant GitHub
    participant NoR
    
    Alice->>openEO: Create account

    Note over Alice,IDE: Creating the openEO process graph
    Alice->>IDE: Create process graph
    Alice->>IDE: Test process graph
    IDE->>openEO: Execute the process graph
    openEO->>openEO: Calculate the result
    openEO-->>Alice: Return result

    Note over Alice,IDE: Creating the openEO UDP
    Alice->>IDE: Create UDP from process graph
    Alice->>IDE: Test UDP
    IDE->>openEO: Execute the UDP process graph
    openEO->>openEO: Calculate the result
    openEO-->>Alice: Return result
    Alice->>GitHub: Publish UDP on GitHub
    
    opt 
        Note over Alice,NoR: Onboarding of the UDP on NoR
        openEO->>Alice: Agree on cost model (contract and SLA)
        openEO->>NoR: Register UDP on the NoR
    end

If Alice wants to test the UDP on an openEO backend other than CDSE, e.g. Terrascope, the flow looks as follows.

sequenceDiagram
    actor Alice
    participant IDE as IDE Service
    participant openEO as Terrascope
    participant GitHub
    participant NoR
    
    Alice->>openEO: Create account
    Alice->>IDE: Test UDP
    IDE->>openEO: Execute the UDP process graph
    openEO->>openEO: Calculate the result
    openEO-->>Alice: Return result

11.2.1.1 Execution Flow

Once the UDP is onboarded onto ESA NoR, a third user, Carol, may discover it (OPENEO-UC4) and want to execute it or use it for large-scale processing (OPENEO-UC5). Carol could be part of the project team or simply find the service through an online platform such as NoR or GEP. Alternatively, Carol could have received the UDP URL from Alice.

Before executing the workflow, Carol needs to ensure sufficient funds are available, potentially utilizing the NoR sponsoring mechanism to secure these resources or using the free credits that are made available on CDSE.

The overall execution flow is illustrated in the following diagram.

sequenceDiagram
    actor Alice as Carol
    participant Provider as CDSE
    participant Infrastructure as Cloudferro/OTC

   
    Alice->>NoR: Discovers the UDP
    opt
        Alice->>NoR: Request sponsoring for execution
        Note over Alice,NoR: After completing NoR procedures
    end 
    Alice->>Provider: Create account 
    Alice->>Provider: Execute UDP
    Provider->>Infrastructure: Execute workflow
    Infrastructure-->>Provider: Return result
    Provider-->>Alice: Return result  

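Carol's "Execute UDP" step can be sketched as a synchronous openEO execute request whose process graph references the published UDP by id and by the URL (namespace) of its JSON definition. All identifiers and URLs below are hypothetical:

```python
import json

# Hedged sketch of Carol's "Execute UDP" step: a synchronous openEO execute
# request whose process graph contains one node referencing the published
# UDP by id and by the URL (namespace) of its JSON definition.
sync_request = {
    "process": {
        "process_graph": {
            "run": {
                "process_id": "ndvi_composite",   # hypothetical UDP id
                "namespace": "https://example.test/ndvi_composite.json",
                "arguments": {
                    "bbox": {
                        "west": 5.0, "south": 51.0, "east": 5.1, "north": 51.1
                    }
                },
                "result": True,
            }
        }
    }
}

body = json.dumps(sync_request)
# Carol's client POSTs `body` to the backend's /result endpoint with her
# bearer token; the backend resolves the UDP, runs the workflow on the
# underlying infrastructure and returns the result.
```
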
11.3 APEx Services Scope (suggestion)

11.3.1 Algorithm Porting

11.3.1.1 OGC AP/API Processes

  • Support users in packaging their algorithm as an OGC AP (AP-UC1), resulting in a CWL workflow.

11.3.1.2 openEO UDP

  • Support users in creating a process graph from an existing algorithm (OPENEO-UC1)
  • Support users in encapsulating the process graph into a UDP (OPENEO-UC2)

11.3.2 Algorithm Onboarding

11.3.2.1 OGC AP/API Processes

  • Deployment of the CWL workflow to a hosting platform (AP-UC2), resulting in an OGC API Process.
  • Deployment of an existing service with customized specifications (AP-UC4 or consider this as upscaling?), resulting in a new OGC API Process.
  • Provide guidelines or support for creating an OGC API Record for catalogue integration to ensure the service is discoverable (AP-UC6, see next section)
  • Provide guidelines for setting up validation benchmarks
  • Support the onboarding of the service onto the NoR (AP-UC5 - maybe covered by agreements between provider and platforms?)

11.3.2.1.1 Catalogue registration

Option 1 - Registration of only the CWL workflow in APEx (result of AP-UC1)

  • Eliminates additional moderation as one CWL could be hosted multiple times with different cost models.
  • If a user wants to execute this CWL workflow, they would need to go through the hosting and execution flow.
  • The only way for APEx to indicate whether this algorithm works properly is by testing it in an APEx-dedicated hosting environment.
  • How can we make clear to users on which platforms this CWL can be hosted? Or can CWLs always be hosted on any platform supporting OGC AP/API Processes?

Example OGC API Record structure:

{
    ...
    "links": [
        {
          "rel": "application-package",
          "type": "application/json",
          "title": "CWL workflow",
          "href": "<LINK TO CWL workflow>"
        }
      ]
}

Option 2 - Registration of the OGC API Process endpoint (result of hosting workflow)

  • If the user wants to execute the service, they would need to go through the execution flow.
  • Can also include link to the CWL if users want to use it in their own environment. If the user wants to deploy their own version or upscale, they can go through the hosting flow. This can be added through an information box, directly linking to the corresponding APEx services.
  • APEx can test the service by requesting NoR funds for the service.
  • Could require additional moderation as the same service could be available with different cost models? Could be defined using multiple entries in a single record?

Example OGC API Record structure:

{ 
    ...
    "links": [
        {
          "rel": "service",
          "type": "application/json",
          "title": "<NAME OF THE SERVICE/PLATFORM>",
          "href": "<LINK TO OGC API PROCESS ENDPOINT>"
        },
        {
          "rel": "application-package",
          "type": "application/json",
          "title": "CWL workflow",
          "href": "<LINK TO CWL workflow>"
        },
         {
          "rel": "nor",
          "type": "text/html",
          "title": "NoR Service",
          "href": "<LINK TO NOR service>"
        },
        {
          "rel": "license",
          "type": "application/pdf",
          "title": "End-User License Agreement",
          "href": "https://geohazards-tep.eu/downloadFiles/EULA-GEP.pdf"
        }
      ]
 }

11.3.2.2 openEO UDP

  • Support the onboarding onto the NoR (OPENEO-UC3 - tools to support cost model definition, …, maybe covered by the platforms?)
  • Provide guidelines and support for creating an OGC API Record for catalogue integration to ensure the service is discoverable (OPENEO-UC4)
  • Provide guidelines for setting up validation benchmarks

11.3.3 Algorithm Upscaling

11.3.3.1 OGC AP/API Processes

  • Support with the execution of production campaigns (AP-UC3)

11.3.3.2 openEO UDP

  • Support and tools for large scale data processing (OPENEO-UC5)