SAP HANA Cookbook

Loading data into SAP HANA – data provisioning methods

The method used to load data into SAP HANA depends on the requirements. This recipe introduces the tools available for data provisioning and explains how they work.

Getting ready

This recipe briefs you on the tools available for data provisioning and their applications.

How to do it…

The data loading process differs from one data provisioning tool to another. In this recipe, we look at the techniques and options available for loading data into SAP HANA with each tool. The recipes that follow deal with each tool individually.

How it works…

As mentioned earlier, the choice of a data provisioning tool depends on the characteristics of the source system and other factors, and each technique works through a different mechanism. Let us look at the key factors that help in deciding which mechanism to select. Only SLT, SAP Data Services (DS), and Sybase replication are discussed here, because loading flat files is a simple import of a file into the SAP HANA system.

SAP Landscape Transformation

SAP Landscape Transformation (SLT) is a trigger-based replication technique. It is the primary technique used for provisioning data from SAP systems. The following are the key factors to consider when selecting SLT as the data provisioning mechanism:

The SLT server has to be installed separately.
Real-time replication of data is possible. If real-time data replication from a source system is required, this is the technique to use.
SLT works by capturing changes made to tables on the source side through database triggers. When data in a table changes, the trigger records the change, and the change is then replicated to SAP HANA.
Replication can run in real time or as a scheduled batch process at periodic intervals.
Data and metadata from tables can be replicated using this technique.
Selective replication of data is possible by applying filters and selecting only the fields that need to be replicated.
SLT can also be used to load data from non-SAP source systems. The source database must meet some criteria to support the replication server that captures the changes.
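
The trigger-based capture described in the preceding list can be illustrated with a small sketch. This is not how SLT itself is implemented or configured; it only mimics the idea. Two in-memory SQLite databases stand in for the source system and SAP HANA, and the table, trigger, and function names (orders, orders_log, replicate) are hypothetical. The filter on region and the reduced column list illustrate selective replication.

```python
import sqlite3

# Assumption: SQLite stands in for both the source database and SAP HANA.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")

source.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
-- Hypothetical logging table filled by the triggers below.
CREATE TABLE orders_log (seq INTEGER PRIMARY KEY AUTOINCREMENT, op TEXT, id INTEGER);
CREATE TRIGGER orders_ins AFTER INSERT ON orders
BEGIN INSERT INTO orders_log (op, id) VALUES ('I', NEW.id); END;
CREATE TRIGGER orders_upd AFTER UPDATE ON orders
BEGIN INSERT INTO orders_log (op, id) VALUES ('U', NEW.id); END;
CREATE TRIGGER orders_del AFTER DELETE ON orders
BEGIN INSERT INTO orders_log (op, id) VALUES ('D', OLD.id); END;
""")
# The target keeps only the selected fields (id, amount).
target.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")


def replicate(region_filter="EU"):
    """Apply logged changes to the target and clear the processed log entries."""
    entries = source.execute(
        "SELECT seq, op, id FROM orders_log ORDER BY seq"
    ).fetchall()
    for seq, op, row_id in entries:
        row = None
        if op != "D":
            # Read the current row image, restricted by the filter.
            row = source.execute(
                "SELECT id, amount FROM orders WHERE id = ? AND region = ?",
                (row_id, region_filter),
            ).fetchone()
        if row is None:
            # Deleted, or no longer matching the filter: drop it from the target.
            target.execute("DELETE FROM orders WHERE id = ?", (row_id,))
        else:
            target.execute(
                "INSERT OR REPLACE INTO orders (id, amount) VALUES (?, ?)", row
            )
        source.execute("DELETE FROM orders_log WHERE seq = ?", (seq,))
    source.commit()
    target.commit()
    return len(entries)


source.execute("INSERT INTO orders VALUES (1, 'EU', 100.0)")
source.execute("INSERT INTO orders VALUES (2, 'US', 50.0)")
source.execute("UPDATE orders SET amount = 120.0 WHERE id = 1")
applied = replicate()
replicated = target.execute("SELECT id, amount FROM orders ORDER BY id").fetchall()
```

Calling replicate() after every change approximates real-time replication, while calling it on a timer approximates the periodic batch mode.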

SAP Data Services

The SAP Data Services (DS) technique is used in most cases. When replicating data using this mechanism, note the following key factors:

A separate software component, SAP Data Services, is required and has to be installed
Replication runs as scheduled batch jobs, for example hourly or daily
Both data and metadata from tables can be replicated to SAP HANA
Complex transformations and data cleansing are possible
The replication can leverage existing extractors, function modules, and programs in the source system
Data loading from non-SAP source systems is also possible
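
To make the batch-oriented, transformation-heavy style of loading concrete, here is a minimal sketch of one scheduled run that extracts rows, cleanses them, and loads the result. It does not use SAP Data Services or any of its APIs; the function names (cleanse, run_batch), the record fields, and the cleansing rules are all assumptions chosen for illustration.

```python
def cleanse(record):
    """Return a cleaned copy of the record, or None if it cannot be used."""
    name = (record.get("name") or "").strip()
    if not name:
        return None  # reject rows without a usable name
    try:
        amount = float(record.get("amount"))
    except (TypeError, ValueError):
        return None  # reject rows whose amount is not numeric
    return {
        "name": name.title(),
        "country": (record.get("country") or "").strip().upper(),
        "amount": amount,
    }


def run_batch(source_rows, target_table):
    """One batch run: cleanse every extracted row and load the valid ones."""
    loaded = rejected = 0
    for record in source_rows:
        clean = cleanse(record)
        if clean is None:
            rejected += 1
        else:
            target_table.append(clean)
            loaded += 1
    return {"loaded": loaded, "rejected": rejected}


rows = [
    {"name": "  alice smith ", "country": "de ", "amount": "10.50"},
    {"name": "", "country": "us", "amount": "5"},
    {"name": "Bob", "country": "us", "amount": "n/a"},
]
target_table = []
stats = run_batch(rows, target_table)
```

In a real landscape the run would be triggered by a scheduler at the chosen interval; here a single call stands in for one hourly or daily job.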

Sybase replication

Sybase replication is a log-based replication technique. It is specific to non-SAP systems and databases, for example, ASE, Oracle, MS SQL, and DB2 UDB on Linux, Unix, and Windows (LUW). The key factors for this replication technique are as follows:

Sybase replication reads the database logs to identify changes in the source system. Hence, replication is carried out at the database level.
In this replication, the application layer is bypassed. Hence, it is a high-performing, real-time replication mechanism.
Filtering or transformation of data is not possible as the application layer is not involved in the replication. Hence, the mapping will be one-to-one and at the table level.
An exact copy of the data in the source table is replicated into SAP HANA.
It supports real-time data replication from non-SAP systems.
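
The log-based approach in the preceding list can be sketched as replaying an ordered change log onto a target that holds exact copies of the source rows. This is only an illustration of the idea, not the replication server's actual log format or API; the entry fields (lsn, op, key, row) and the function name apply_log are hypothetical.

```python
def apply_log(log_entries, target, last_applied_lsn=0):
    """Replay log entries newer than last_applied_lsn onto the target.

    The target maps a primary key to an unmodified copy of the source row,
    so the mapping stays one-to-one with no filtering or transformation.
    Returns the new log position.
    """
    for entry in sorted(log_entries, key=lambda e: e["lsn"]):
        if entry["lsn"] <= last_applied_lsn:
            continue  # already replicated in an earlier pass
        if entry["op"] == "DELETE":
            target.pop(entry["key"], None)
        else:
            # INSERT and UPDATE entries carry the full row image.
            target[entry["key"]] = dict(entry["row"])
        last_applied_lsn = entry["lsn"]
    return last_applied_lsn


log = [
    {"lsn": 1, "op": "INSERT", "key": 1, "row": {"id": 1, "city": "Pune"}},
    {"lsn": 2, "op": "UPDATE", "key": 1, "row": {"id": 1, "city": "Mumbai"}},
    {"lsn": 3, "op": "INSERT", "key": 2, "row": {"id": 2, "city": "Delhi"}},
    {"lsn": 4, "op": "DELETE", "key": 2, "row": None},
]
hana_copy = {}
position = apply_log(log, hana_copy)
# Replaying the same log from the saved position changes nothing.
position_again = apply_log(log, hana_copy, position)
```

Tracking the last applied position is what lets such a reader resume from the log without reprocessing earlier changes.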

Considering the features and key points of each replication technique, it is clear that these data provisioning mechanisms differ functionally and technically. A solution has to be built by selecting the technique that best fits the business requirement.

There's more…

We have to think about strategic and technical considerations while deciding the exact data provisioning technique. We will discuss these briefly.

Strategic considerations

First we must understand the operational and corporate requirements. For this, there are certain factors to be considered. These are listed as follows:

Real-time or non-real-time replication of data
Source system
Type of data: transactional, hierarchical, unstructured, and so on
Complexity of transformations

Answering these questions brings up different situations. The source may be an SAP system, a non-SAP system, a disk-based legacy database, external files such as CSV (comma-delimited) files, or unstructured data, and the preferred data provisioning tool changes accordingly. For example, SAP Data Services is preferred for unstructured data because the data must be cleansed before loading. If the data is available as external files, we may not need any tool; the files can be imported directly into SAP HANA using SAP HANA Studio, although SAP Data Services can also load from files if required. If real-time data replication is required, SLT is preferred because it loads up-to-the-minute data from all source systems that are compatible with SLT, which maximizes the availability of updated data to end users. When extensive transformations and data cleansing are required, we go with SAP Data Services.
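
The preferences in this paragraph can be summarized as a small decision helper. This is a simplification for illustration only: the function name, its parameters, the order in which the rules are checked, and the batch default in the last line are assumptions, and a real decision would also weigh the technical considerations discussed next.

```python
def choose_provisioning_tool(real_time=False, source_is_file=False,
                             complex_transformation=False, unstructured=False):
    """Return the tool this paragraph prefers for the given requirements."""
    # Cleansing and heavy transformation needs point to SAP Data Services.
    if unstructured or complex_transformation:
        return "SAP Data Services"
    # Plain external files can be imported directly.
    if source_is_file:
        return "SAP HANA Studio file import"
    # Up-to-the-minute data from SLT-compatible sources.
    if real_time:
        return "SLT"
    # Assumption: batch loading is the fallback when nothing else applies.
    return "SAP Data Services"


csv_choice = choose_provisioning_tool(source_is_file=True)
live_choice = choose_provisioning_tool(real_time=True)
messy_choice = choose_provisioning_tool(unstructured=True)
```

Checking the transformation rule first reflects the emphasis on cleansing before loading; a landscape that needs both real-time data and heavy transformation would need a closer look than this helper provides.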

Technical considerations

On the other hand, technical considerations also have to be taken into account before deciding on the replication tool. This includes the following factors:

Data replication capabilities
Source system compatibility
Administration/configuration aspects

The following table gives a clear picture of the entire comparison of different data provisioning techniques:

The first comparison is with regard to data replication properties:

The next comparison is with regard to source system compatibility, as shown:

The comparison with regard to the administration and configuration aspects is shown here: