
1. Introduction

The Data Gateway solution is a robust and versatile product crafted to streamline secure and efficient data exchange between trading partners, applications, and cloud services. Underpinned by the Spring Boot framework, the Data Gateway functions as a central hub orchestrating authentication, file transfers, and cloud data transactions. This documentation serves as a comprehensive guide to understanding, implementing, and optimizing the multifaceted functionalities of the Data Gateway server. 

2. About Data Gateway

Data Gateway serves as a cornerstone for centralized and secure file transfers, ensuring the confidentiality and integrity of sensitive data. Tailored for Super Admins, Admins, and File Operators, this application simplifies the intricacies of secure file exchange. Distinguished by its support for widely used protocols like SFTP (SSH File Transfer Protocol), FTP (File Transfer Protocol), and FTPS (FTP Secure), the Data Gateway focuses on delivering seamless and secure data transmission. The gateway's Secure Component sits in the DMZ, while the remaining layers reside in the secure zone with no internet access. The Secure Component provides the Forward and Reverse Proxy functionality, and all connections are securely transmitted. Organizations benefit from the flexibility to choose the most suitable and secure protocol tailored to their specific use cases. Explore the features, configurations, and best practices outlined in this documentation to harness the full potential of Data Gateway for your data exchange needs. 

3. Key Features

  1. Secure File Transfer: Utilize industry-standard protocols and cloud-native solutions for secure and encrypted file transfers, encapsulated with Forward/Reverse Proxy, File Validation, and Scanning capabilities. 
  2. Role-Based Access: Different access roles (Super Admin, Admin, File Operator) ensure controlled access and tailored functionalities. 
  3. Comprehensive Dashboard: Gain insights into file transactions with detailed statistical reports, offering a clear overview of activities. 
  4. Cloud Configurations: Seamlessly configure cloud providers, allowing flexibility in storing partner-related keys/passwords. 
  5. Access Management: Empower Super Admins and Admins with the ability to create, update, or delete users and manage trading partner buckets. 
  6. Trading Partner Module: Set up partners with ease, choosing from various protocols (SFTP, FTP, FTPS) and cloud storages (GCS, S3, Azure, IBM-Object-Storage). 
  7. File Operator Module: Search, upload, and download files directly from assigned buckets, providing efficient file management. 
  8. Settings Module: Configure schedulers in the Polling Interval submodule for automated tasks. 
  9. File Replay: Restore deleted files and reprocess them from clouds to servers with the File Replay feature. 

4. Glossary

Partner: External entities who send/receive files.

Endpoint: Connections for transferring files.

Gateway: Organization/Data Gateway Server.

Push to Gateway: Partner connects to the Data Gateway Server and drops the file.

Push to Partner: Data Gateway connects to the Partner's remote server and drops the file.

Pull from Partner: Data Gateway connects to the Partner's remote server and pulls the file.

Pull from Gateway: Partner connects to the Data Gateway Server and pulls the file.

Pickup Directory: The directory where the Client will pick up the files. Used for Pull from Partner and Push to Gateway.

Drop Directory: The directory where the Client will drop the files. Used for Push from Hub and Push from Partner.

 

5. System Requirements

The Data Gateway Application is supported on the following system configurations: 

                  Minimal          Recommended
Operating System  Linux (RHEL) 8   Linux (RHEL) 9
RAM               8 GB             8 GB
Disk              50 GB            100 GB
CPU               4 cores          8 cores

 

6. Application Access

 

The Data Gateway UI URL can be accessed once the deployment is complete, using the hostname provided in the deployment configuration. It prompts you for the login credentials. Upon first login, the Administrator can log in with the factory credentials [Expl0re@123], which must be changed through Access Management after the initial login. This is the local authentication process, where the credentials are stored within the Data Gateway. 

The application can also be deployed using a SAML profile, allowing authentication to be handled via SAML. In this process, the credentials are stored in the Identity Provider (IDP) (for example, OKTA) and not in the Data Gateway. 

6.1 Authentication Process

The Data Gateway Application employs a robust two-factor authentication system for enhanced security when the application is deployed without SAML. Users need to follow these steps for secure access: 

  1. Enter a valid registered Username and Password in the Data Gateway UI. 
  2. Upon successful login, a unique one-time password (OTP) is sent to the user’s registered email address. 
  3. Input the received OTP to complete the authentication process. 
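As a hedged illustration (not the Data Gateway's actual implementation), the OTP step can be sketched as follows, assuming a six-digit numeric code with a short validity window:

```python
import secrets
import time

OTP_TTL_SECONDS = 300  # assumed 5-minute validity window


def generate_otp() -> tuple[str, float]:
    """Generate a six-digit one-time password and its expiry timestamp."""
    code = f"{secrets.randbelow(10**6):06d}"
    return code, time.time() + OTP_TTL_SECONDS


def verify_otp(submitted: str, issued: str, expires_at: float) -> bool:
    """Accept the OTP only if it matches and has not expired."""
    if time.time() > expires_at:
        return False
    # constant-time comparison guards against timing attacks
    return secrets.compare_digest(submitted, issued)


code, expires_at = generate_otp()
print(verify_otp(code, code, expires_at))  # True for a fresh, matching code
```

In the real flow, the issued code and expiry would be stored server-side and the code delivered to the user's registered email address, as described above.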

6.2 Application Swagger UI

Once the Data Gateway Application is deployed, users can access the application APIs through Swagger UI, providing a convenient and interactive interface for exploring and interacting with the application’s endpoints. Swagger UI simplifies the process of understanding and testing different functionalities through the APIs available. 

The APIs are enabled with both basic authentication and Token Authentication.
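A client calling the APIs would construct the Authorization header one of two ways; a minimal sketch (the username, password, and token values are placeholders):

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authentication header (RFC 7617)."""
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    return {"Authorization": f"Basic {credentials}"}


def bearer_auth_header(token: str) -> dict:
    """Build a Bearer token Authorization header."""
    return {"Authorization": f"Bearer {token}"}


print(basic_auth_header("admin", "secret"))
# {'Authorization': 'Basic YWRtaW46c2VjcmV0'}
```

Either header can be pasted into Swagger UI's Authorize dialog when exercising the endpoints.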

7. Roles

The Data Gateway Application features role-based access control: 

S No  Role           Accessible Modules
1.    Super Admin    Dashboard, File Management, End Point Management, Access Management, Alert Management, Cloud Configurations, and Settings.
2.    Admin          Access Management (Create and Manage User and Group).
3.    File Operator  Dashboard and File Management.
4.    File Manager   Dashboard, File Management, End Point Management, Alerts Management, and Settings.
5.    Business User  Dashboard, File Transfers, End Point Management (Manage Endpoint), and Alerts Management (Manage Alerts & Alerts Dashboard).

 

8. Dashboard Reports (Statistics)

The Data Gateway Dashboard is your central hub for monitoring and managing file transactions. The dashboard provides a comprehensive summary of today's and recent transactions, ensuring you stay informed about your data transfer activities. Below is an overview of key metrics:  

In each report, you’ll find the following details: 

  1. Total Files: The overall count of files uploaded/downloaded during the specified time frame. 
  2. Total Size: The total size of files uploaded/downloaded throughout the week, today, or this month (in Bytes, KB, MB, GB, TB…). 
  3. Success: Number of files successfully uploaded/downloaded during the specified time frame. 
  4. Failure: Number of files that encountered upload/download failures during the specified time frame. 

These fields are consistent across all reports, providing a standardized view of your data transfer statistics. Stay tuned for real-time updates on your data transfer activities. 
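The Total Size field rolls byte counts up to the nearest unit (Bytes, KB, MB, GB, TB). A small helper of the kind that produces such labels, offered as an illustrative sketch rather than the dashboard's actual code:

```python
def human_size(num_bytes: float) -> str:
    """Format a byte count using the units the dashboard reports."""
    for unit in ("Bytes", "KB", "MB", "GB", "TB"):
        if num_bytes < 1024 or unit == "TB":
            return f"{num_bytes:.2f} {unit}"
        num_bytes /= 1024  # step up to the next unit


print(human_size(1536))  # 1.50 KB
```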

9. Cloud Configurations

Cloud Configurations are a pivotal component of the Data Gateway, providing users with the ability to seamlessly integrate with various cloud providers for efficient storage and retrieval of files. This section guides users through the configuration process for different cloud providers, ensuring a secure and reliable data transfer experience. 

9.1 GCS Cloud Configurations

 

9.1.1 Service account creation

  1. Sign in to the [Google Cloud Console]. 
  2. Select an existing project or create a new Google Cloud project.  
  3. Navigate to IAM & Admin -> "Service Accounts". 
  4. Click "+ CREATE SERVICE ACCOUNT" and fill in the required details. 
  5. After creating the service account, search for it, open it, and assign the necessary roles/permissions (e.g., Storage Admin). 
  6. Create and download a JSON key file: 
  7. Go to KEYS -> ADD KEY. 
  8. Clicking ADD KEY presents two options: 
    • Create new key  
    • Upload existing key 
  9. Keys can be generated as .JSON or .P12 files. 
  10. Select the Key Type as JSON and click the Create button to download the JSON key file. 

Example .json file 

 

9.1.2 Pub/Sub Creation

Select the Pub/Sub category in the Google Cloud account and set up the Topics and Subscriptions.

 
9.1.2.1 Topic Creation:
  1. Click on Topics -> Create Topic, and provide the Topic ID. 
  2. When you create or update a topic, you must specify its properties. 

 

  • Add a default subscription. Adds a default subscription to the Pub/Sub topic. You can create another subscription for the topic after the topic is created. The default subscription has the following properties:
    • Subscription ID of -sub 
    • Pull delivery type 
    • Message retention duration of seven days 
    • Expiration after 31 days (about 1 month) of inactivity 
    • Acknowledgment deadline of 10 seconds 
    • Immediate retry policy 
  • Use a customer-managed encryption key (CMEK). Specifies if the topic is encrypted with a CMEK. Pub/Sub encrypts messages with Google-owned and Google-managed keys by default. If you specify this option, Pub/Sub uses the envelope encryption pattern with CMEK. In this approach, Cloud KMS does not encrypt the messages. Instead, Cloud KMS encrypts the Data Encryption Keys (DEKs) that Pub/Sub creates for each topic. Pub/Sub encrypts the messages using the newest DEK that was generated for the topic. Pub/Sub decrypts the messages shortly before they are delivered to subscribers. 
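The console steps above have a command-line equivalent. A sketch using the gcloud CLI, where the topic, subscription, project, and key names are placeholders (not values from this guide):

```shell
# Create a topic (messages encrypted with Google-managed keys by default)
gcloud pubsub topics create my-topic

# Optionally encrypt the topic with a customer-managed encryption key (CMEK)
gcloud pubsub topics create my-cmek-topic \
    --topic-encryption-key=projects/my-project/locations/us-central1/keyRings/my-ring/cryptoKeys/my-key

# Attach a subscription matching the default properties described above
gcloud pubsub subscriptions create my-topic-sub \
    --topic=my-topic \
    --ack-deadline=10 \
    --message-retention-duration=7d \
    --expiration-period=31d
```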

 

9.1.2.2 Create Subscription:
  1. Click on Subscriptions -> Create Subscription, and provide the Subscription ID. 
  2. Use the default existing subscription or create a new one by selecting Create Topic. 
  3. If Create Topic is selected, the console opens another pop-up to create the topic. 

4. Schema. A schema is a format that the message data field must follow. A schema is a contract between the publisher and subscriber that Pub/Sub enforces. Topic schemas help standardize message types and permissions to allow them to be consumed by different teams in your organization. Pub/Sub creates a central authority for message types and permissions.  

 

  • Enable ingestion. Enabling this property lets you ingest streaming data from external sources into a topic so that you can leverage the features of Google Cloud.  
  • Message retention duration. Specifies how long the Pub/Sub topic retains messages after publication. After the message retention duration is over, Pub/Sub might discard the message regardless of its acknowledgment state. Message storage fees are charged for storing all messages published on the topic. 
    • Default = Not enabled 
    • Minimum value = 10 minutes 
    • Maximum value = 31 days (about 1 month) 

 

9.1.3 Bucket Creation

Step 1: Log in to Google Cloud Console
1. Open your web browser and go to Google Cloud Console.
2. Log in with your Google account.

Step 2: Select Your Project 

  1. Click the project selector dropdown at the top.
     
  2. Choose an existing project or create a new project.
     
  3. Ensure billing is enabled for your project. 

Step 3: Navigate to Cloud Storage 

  1. In the left sidebar menu, go to Buckets and click "Create" at the top.

Step 4: Enter a Unique Bucket Name 

  1. The bucket name must be globally unique across all of Google Cloud. 
  2. It must:
    • Contain only lowercase letters, numbers, dashes (-), and underscores (_). 
    • Be 3-63 characters long. 
    • Not start or end with a dash (-). 
  3. Example: my-gcs-bucket-123
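These naming rules can be checked programmatically. A small validator sketch covering only the rules listed above (not Google's full rule set, which has additional restrictions such as reserved prefixes):

```python
import re


def is_valid_bucket_name(name: str) -> bool:
    """Check a GCS bucket name against the basic naming rules above."""
    if not 3 <= len(name) <= 63:
        return False
    if name.startswith("-") or name.endswith("-"):
        return False
    # only lowercase letters, numbers, dashes, and underscores
    return re.fullmatch(r"[a-z0-9_-]+", name) is not None


print(is_valid_bucket_name("my-gcs-bucket-123"))  # True
print(is_valid_bucket_name("My_Bucket"))          # False (uppercase)
```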

Step 5: Select a Storage Location 

The location determines where your data is stored and affects performance and cost. You have three options: 

  • Region (e.g., us-central1): Data is stored in a single region. 
  • Dual-region (e.g., nam4): Data is replicated in two locations. 
  • Multi-region (e.g., US): Data is stored across multiple locations for higher availability.

Step 6: Choose a Storage Class 

The storage class determines the cost and retrieval speed: 

  • Standard (default) – Best for frequently accessed data. 
  • Nearline – Best for data accessed once a month. 
  • Coldline – Best for data accessed once a year. 
  • Archive – Best for long-term storage (10+ years).

Step 7: Configure Access Control 

You can choose between: 

  • Uniform (Recommended) – Permissions apply at the bucket level. 
  • Fine-grained – Permissions apply to individual objects.

 

Step 8: Set Advanced Settings (Optional) 

8.1. Soft-delete policy (for data recovery) 

  • When enabled, objects deleted from the bucket are not immediately erased. Instead, they are retained for a specified period before permanent deletion. 
  • This allows recovery of mistakenly deleted objects within the retention period. 

Options: 

  • Use default retention duration -> The default retention period is 7 days unless changed by an administrator. 
  • Set custom retention duration -> Allows you to specify a custom duration for how long deleted objects should be retained.

 

8.2. Object Versioning (for version control) 

  • When enabled, Cloud Storage keeps multiple versions of an object, allowing rollback to previous versions. 
  • Useful for protecting against accidental overwrites or deletions. 

Fields: 

  • Max. number of versions per object -> Defines how many previous versions of an object are stored. 
    • Setting this to 1 keeps only one previous version (not recommended for full overwrite protection). 
    • Increasing this number ensures more backup versions are kept. 
  • Expire non-current versions after X days -> Automatically deletes older object versions after the specified period. 
    • Helps manage storage costs. 
    • Recommended: 7 days for the Standard storage class.

 

8.3. Retention (for compliance) 

  • Prevents objects from being deleted or modified for a specific period. 
  • Useful for regulatory compliance and data protection policies. 

Options: 

  • Set bucket retention policy -> Applies a uniform retention period for all objects in the bucket. 
  • Enable object retention -> Allows retention settings at the individual object level.

Step 9: Enable Encryption (GCP-managed or Customer-managed keys) and Create the Bucket 

  • By default, Google provides a Google-managed encryption key. Click "Create" to finalize the setup.

 

9.1.4 Google Cloud Configuration in Data Gateway

The Cloud Provider configurations are setup for onboarding the Partners. The Data Gateway allows users to store partner-related keys or passwords in cloud-native secret managers, enhancing security and flexibility. In the absence of specific configurations for cloud secret managers, the keys or passwords are encrypted and stored in the database by default. 

To leverage cloud-native Secret Managers, users need to enable the service by clicking on the ‘Secret Manager’ checkbox. Authentication is then conducted based on the configured settings. It’s important to note that, currently, Data Gateway supports single user accounts for cloud providers. 

 

  1. Navigate to Cloud Configurations. 
  2. Click the add icon (+) displayed on the GCS Configurations tab. 

3. Select the GCS Storage Service checkbox.  

  1. This option enables Google Cloud Storage (GCS) service integration. 
  2. When checked, it allows users to configure storage buckets and Pub/Sub topics in Google Cloud. 
  3. Use this when your service needs to store data in Google Cloud Storage. 

4. You can select the GCS Secret Manager Service. 

  1. This option enables the Google Cloud Secret Manager service. 
  2. It allows users to store, manage, and access sensitive information such as API keys, passwords, or certificates securely. 
  3. Only select this option if the integration is related to secret management rather than database storage. 

5. Configuration Details: 

  • Storage Service Name: Provide any unique name for the identification of the GCS Storage Service. 
  • Auth File: Users need to provide the Google-Auth.json file. 
  • Project ID: Provide the Project ID which was created in GCS.  
  • Topic Name: Provide the Pub/Sub Topic Name where messages are published whenever an event happens in the bucket (such as object creation, update, or deletion). Currently, only object creation events are processed. 
  • Subscription Name: Provide the Subscription Name associated with the Pub/Sub topic. Subscriptions allow services to receive messages from the topic. 
  • Region: Provide the region where the GCS Storage is hosted. 

6. Enable Versioning:  

  1. You can select this to enable Object Versioning in GCS. 
  2. It retains the older versions of the object whenever objects are updated/deleted. 

9.2 Amazon Web Services (AWS) Configuration

 

9.2.1 IAM User Creation

1. Sign in to the [AWS Management Console](https://aws.amazon.com/console/).
2. Navigate to "IAM" -> "Users".

3. Click "Add user" and provide a username.

4. Assign "Programmatic access" and click Next.

5. Attach the necessary policies (e.g., AmazonS3FullAccess).

6. Once the user is created successfully, open the user details page and navigate to the Security credentials tab.

7. Scroll down to the Access keys section, click Create access key, and download the key material (Access Key ID and Secret Access Key).

9.2.2 Steps to Configure AWS

 

1. Navigate to Cloud Configurations.
2. Click the add icon displayed on the AWS Configurations tab.

3. Select the AWS Storage Service checkbox.

  • This option is to enable AWS Cloud Storage service integration.
  • When checked, it allows users to configure storage buckets in AWS.

4. You can select the AWS Secret Manager Service

  • This option is for enabling AWS Secret Manager service.
  • It allows users to store, manage, and access sensitive information such as API keys, passwords, or certificates securely.
  • Only select this option if the integration is related to secret management rather than database storage.

5. Configuration Details:

  • Bucket Name: Provide the name of the AWS S3 bucket where files will be stored. This bucket acts as a container for data that needs to be archived or backed up. You can provide a new bucket name or can provide the existing bucket name.
  • Access Key: The unique AWS Access Key ID used to authenticate API requests. It is required to grant programmatic access to AWS services.
  • Secret Key: The secret key associated with the Access Key ID. This key ensures secure authentication when accessing AWS resources.
  • Region: Provide the AWS region where the S3 bucket is hosted. The region selection ensures the data is stored in the correct geographical location based on performance and compliance requirements.
  • Account Id: The unique AWS account ID to identify the AWS user or account. It helps to differentiate between multiple AWS accounts.
  • Queue Name: The name of the AWS SQS queue used for processing messages or events. This queue handles asynchronous tasks related to file storage and identifies the queue used for message processing and communication between services. Any custom name can be provided in this field. 
  • Queue URL:  The unique URL endpoint for the queue. This URL is used by applications to send and receive messages from the queue. It is automatically generated when the queue is created in services like AWS SQS. 
  • Queue ARN: The Amazon Resource Name (ARN) is a globally unique identifier assigned to the queue in AWS. It provides a standardized way to reference the queue across AWS services and policies, ensuring proper access control and integration.

6. Archive File Interval:

Provide the lifecycle policy for AWS, which facilitates the replay of files within a specified period. This value specifies the number of days after which files are automatically archived from the storage bucket.
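Such archival is typically expressed in S3 as a lifecycle rule. A hedged example of what a 30-day transition rule might look like (the rule ID, prefix, interval, and target storage class are illustrative choices, not values mandated by Data Gateway):

```json
{
  "Rules": [
    {
      "ID": "archive-after-30-days",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "Transitions": [
        {"Days": 30, "StorageClass": "GLACIER"}
      ]
    }
  ]
}
```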

9.3 Microsoft Azure Configuration

 

 

9.3.1 Storage Account Creation

 

1. Sign in to the [Azure Portal](https://portal.azure.com/).
2. Navigate to "Storage accounts".

3. Click "+ Add" and fill in the required details to create a new storage account, or choose an existing one.
4. Navigate to the storage account -> "Access keys".

5. Click on "Access keys" to get the connection string.

9.3.2 Steps to Configure Azure:

 

1. Navigate to Cloud Configurations.
2. Click the add icon displayed on the Azure Configurations tab.

3. Select the Azure Storage Service checkbox.

  • This option is to enable Azure Cloud Storage service integration.
  • When checked, it allows users to configure storage buckets in Azure.

4. Storage Service Name:

  • Provide any unique name to identify the storage service configuration.

5. Connection String:

  • The Azure Storage Account Connection String, which is required to authenticate and connect to the Azure Blob Storage service.

6. Container Name:

  • The name of the Azure Blob Storage Container where the files will be stored.

7. Region:

  • Specifies the Azure Region where the storage account is hosted, such as East US, West Europe, or Central India.

8. Archive File Interval:

  • Provide the lifecycle policy for Azure Storage, which facilitates the replay of files within a specified period. This value specifies the number of days after which files are automatically archived from the storage bucket.

9. Event Hub:

  • The Event Hub Name that will be used to publish or subscribe to events.

10. Event Connection String:

  • The Azure Event Hub Connection String used to connect to the Event Hub for monitoring file activities or real-time data streaming.

11. You can select the Azure Secret Manager Service

  • This option is for enabling Azure Secret Manager service.
  • It allows users to store, manage, and access sensitive information such as API keys, passwords, or certificates securely.
  • Only select this option if the integration is related to secret management rather than database storage.

12. Key Vault Name:

  • Provide the Azure Key Vault where sensitive information like secrets, certificates, and keys are securely stored.

13. Subscription Id:

  • The Subscription Id is a unique identifier assigned to your Azure subscription.
  • This ID links the Key Vault service to the specific Azure subscription under which the services are being used.
  • You can find this Subscription Id in the Azure Portal under Subscriptions.

14. Resource Group Name:

  • A Resource Group Name is a container that holds related Azure resources for your application.
  • It organizes all the Azure services like Key Vault, Storage Accounts, and Virtual Machines into a group.
  • Enter the exact resource group name under which the Key Vault is created.

15. App Client Id:

  • The App Client Id is the Application (Client) ID registered in Azure Active Directory.
  • This ID identifies the application that needs to access Azure resources.
  • You can get this ID from the Azure App Registration under Azure Active Directory.

16. Client Secret:

  • The Client Secret is a secret key generated during the App Registration process.
  • It acts like a password and is used to authenticate the application to access the Azure services.
  • This value needs to be copied while generating the secret in the Azure portal, as it will not be visible again.

17. Tenant Id:

  • The Tenant Id is the Directory ID of your Azure Active Directory.
  • It identifies the Azure AD instance where your application is registered.
  • You can find this value in the Azure Active Directory Overview page.
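The connection string supplied in step 5 is a semicolon-separated list of key=value pairs. A small parser sketch (the account name and key shown are placeholders, not real credentials):

```python
def parse_connection_string(conn_str: str) -> dict:
    """Split an Azure storage connection string into its key/value parts."""
    parts = {}
    for segment in conn_str.split(";"):
        if not segment:
            continue
        # split on the first '=' only; account keys contain '=' padding
        key, _, value = segment.partition("=")
        parts[key] = value
    return parts


example = (
    "DefaultEndpointsProtocol=https;"
    "AccountName=examplestorage;"
    "AccountKey=PLACEHOLDER_KEY==;"
    "EndpointSuffix=core.windows.net"
)
print(parse_connection_string(example)["AccountName"])  # examplestorage
```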

9.4 IBM Cloud Configuration

 

 

9.4.1 Service Credential Creation

 

1. Sign in to the [IBM Cloud Console](https://cloud.ibm.com/).
2. Navigate to "Resource List" -> "Storage" -> "Cloud Object Storage".

3. Click on your instance and navigate to "Service credentials".

4. Click "New credential" and provide a name.

5. After clicking on New credential, a pop-up window will appear. Fill in the required details and click on Add to proceed.

6. Save the generated credentials (API key, Access Key, Secret Key, and service instance ID), which will be required at a later stage.

9.4.2 Steps to Configure IBM Cloud:

 

1. Navigate to Cloud Configurations.
2. Click the add icon displayed on the IBM-Cloud Configurations tab.