Data Management in SAP HANA Cloud
The storage options in SAP HANA Cloud
In-memory
- Best used for frequently changed and accessed data, i.e., hot data
- Offers high performance in data analysis and transformation
- High Total Cost of Ownership (TCO)
- All the power of the SAP HANA engine
Native Storage Extension (NSE)
- Best used for data that changes and needs to be accessed less frequently, i.e., warm data
- Offers easy access to data and seamless tiering and virtualization with in-memory
- Lower TCO than in-memory
SAP HANA Cloud data lake
- Best used for data that rarely changes or does not need to be accessed frequently, i.e., cold data
- Large storage capacity, up to petabyte scale
- Able to store structured and unstructured data and run analytical queries on it
- Lowest TCO
About Data Tiering in SAP HANA Cloud
Not all data is equal. How often you access it, its type, its operational usefulness, and security requirements will all affect its value. Time is another critical factor when assessing the value of data. As data ages, it becomes less relevant for analytics and is accessed less frequently. When you combine these different data values with limited IT budgets, there’s a need for cost-effective data management strategies that prioritize high-value data.
Data tiering gives you that cost-effective storage by assigning data to different storage and processing tiers based on its value. With SAP HANA Cloud, you can split data across three temperature tiers: hot, warm, and cold. As the data’s value changes, you can move it from one tier to another.
Hot data is stored in-memory. Hot storage is ideal for high-value data that requires real-time processing and analytics. In SAP HANA Cloud, in-memory storage is both the highest-performance and the highest-TCO option.
As the data becomes less of a priority, you can move it down to warm storage. Warm data resides on disk, which in SAP HANA Cloud is managed by the Native Storage Extension (NSE). Warm data isn’t fully loaded into memory, so this option is more cost-effective than hot storage but still has very low latency. It’s best for data that doesn’t need to be accessed frequently.
Cold storage is for data that is rarely accessed. This kind of data is managed separately from the SAP HANA Cloud database but can still be accessed through the data virtualization capabilities. In SAP HANA Cloud, we recommend that you store cold data in the integrated data lake. The data lake combines massive storage capacity, up to petabyte scale, with a structure that simplifies and accelerates data analysis. The SAP HANA Cloud relational data lake ensures that applications can rapidly access data despite massive data volumes.
SAP HANA Cloud offers a broad choice of storage to get the best performance with the lowest TCO.
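As a minimal sketch of the tiering idea described above, the following Python snippet assigns a temperature tier based on how long ago the data was last accessed. The thresholds (30 and 365 days) and the function name are hypothetical and are not SAP recommendations; in practice, value also depends on data type, operational usefulness, and security requirements.

```python
from datetime import date

# Hypothetical thresholds (not from SAP); tune them to your own access patterns.
HOT_MAX_DAYS = 30
WARM_MAX_DAYS = 365


def choose_tier(last_accessed: date, today: date) -> str:
    """Return 'hot', 'warm', or 'cold' based on days since last access."""
    age_days = (today - last_accessed).days
    if age_days <= HOT_MAX_DAYS:
        return "hot"   # candidate for in-memory storage
    if age_days <= WARM_MAX_DAYS:
        return "warm"  # candidate for Native Storage Extension (NSE)
    return "cold"      # candidate for the data lake


today = date(2024, 1, 1)
for last_accessed in (date(2023, 12, 20), date(2023, 6, 1), date(2020, 1, 1)):
    print(last_accessed, "->", choose_tier(last_accessed, today))
```

A rule like this only suggests where data might belong; moving the data between tiers is still done with the SAP HANA Cloud tools.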
Data Virtualization and Data Replication in SAP HANA Cloud
Finding the right balance between costs and performance is a constant challenge in data management. SAP HANA Cloud makes it easier to balance those needs by giving customers control over when data can be virtually accessed, or when this data is replicated for faster consumption and transformation.
Virtualization means that the data is accessed remotely and remains in its current storage location.
Virtualization plays a major role when you need to access warm or cold data, whether it’s stored in SAP HANA Cloud or on remote sources. You can virtualize data stored within SAP HANA Cloud’s storage options as well as data from remote sources, including your SAP HANA on-premise landscape.
Replication means that the data is duplicated and a copy is stored in SAP HANA Cloud, so it can be accessed faster.
While virtualization allows organizations to save on storage costs by keeping the data in “colder”, slower storage solutions, it impacts how fast applications can access and transform this data. Slower access can sometimes impact business results or critical services, which means that for important data it makes sense to invest in a “hot”, or in-memory, storage solution.
This means replicating this data and keeping it at your fingertips, immediately accessible and ready to be processed. The downside is, of course, that in-memory storage can get expensive.
Finding balance with SAP HANA Cloud
If costs are the bigger problem, then SAP HANA Cloud’s virtualization capabilities will help you create virtual tables to avoid increasing your storage costs. Virtual tables appear in SAP HANA Cloud as if they were local tables, but they let you avoid copying the data into your SAP HANA Cloud storage space.
After you create your remote sources, you can start to access data in them by creating virtual tables. Find out here how to create a virtual table in SAP HANA Cloud.
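To illustrate what creating a virtual table involves, here is a small Python sketch that only builds the SQL text you would then run in a SQL console. It assumes the general form CREATE VIRTUAL TABLE ... AT "remote source"."database"."schema"."table", with "<NULL>" standing in when the remote source has no database level; verify the exact syntax against the SQL reference for your release. All schema, table, and remote source names are hypothetical.

```python
def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_virtual_table_sql(local_schema: str, virtual_table: str,
                             remote_source: str, remote_schema: str,
                             remote_table: str,
                             remote_database: str = "<NULL>") -> str:
    """Build a CREATE VIRTUAL TABLE statement (assumed general form)."""
    local = quote_ident(local_schema) + "." + quote_ident(virtual_table)
    remote_parts = (remote_source, remote_database, remote_schema, remote_table)
    remote = ".".join(quote_ident(part) for part in remote_parts)
    return f"CREATE VIRTUAL TABLE {local} AT {remote}"


# Hypothetical names, for illustration only.
print(create_virtual_table_sql("SALES", "V_ORDERS", "MY_REMOTE", "ERP", "ORDERS"))
```

The helper does not connect to a database; it only shows which pieces of information a virtual table definition needs: a local name and the location of the remote object.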
If later you decide it’s better to replicate parts of this data, you can do it even for parts of structured tables. Find out here how to replicate data into SAP HANA Cloud using SDI.
Balance and flexibility with SAP HANA Cloud means making your own decisions at the right time, with a little help.
The SAP HANA Cloud data lake
Big data is here, no matter the goals of your organization. Sometimes data is not only generated in large volumes, but it also must be stored for long periods of time, in case you need to comply with rules and regulations related to data. We call this “cold” data, the data that typically is written once, updated rarely, and analyzed for patterns. Organizations face the challenge of finding a sustainable, cost-effective, and efficient solution for all this data that mostly does not need to be accessed or used often, but still needs to be stored.
This is where the SAP HANA Cloud data lake can be very effective. It can reach petabyte scale and can store both structured and unstructured data. This means that you can store the data without structuring it first and then use it for analytics, if necessary. Built on SAP IQ, the data lake provides excellent performance for analytics across large volumes of data. With the SAP HANA Cloud data lake, you avoid the risk of creating an uncontrollable, unusable data dump.
The SAP HANA data lake
The data lake in SAP HANA Cloud is integrated with the SAP HANA database, but at the same time independent from it when it comes to storage and compute. In SAP HANA Cloud, you can choose to use a data lake with each SAP HANA Cloud instance you create. A single click enables a data lake in your instance, either when you first create it or later, as needed. You can also take advantage of cloud elasticity to scale your data lake storage and compute up or down whenever necessary.
For the initial release of SAP HANA Cloud, a data lake starts with 1 TB of storage and can grow to 90 TB in 1 TB increments. The maximum compute capacity of a data lake is 162 vCPUs. To access your data lake, you can use the existing SAP HANA tools, such as the Database Explorer. The data lake supports ingestion of data from object storage locations (e.g., Azure Blob Storage, Amazon S3) at a rate of 1 TB per day per dedicated vCPU per table, up to a maximum of 16 TB per day per table.
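The sizing figures above lend themselves to a quick back-of-the-envelope calculation. The Python sketch below checks whether a storage size falls within the stated range and estimates how long it would take to ingest a table. It assumes the ingestion rate scales linearly with dedicated vCPUs up to the per-table cap, which is an interpretation of the figures quoted here rather than a guaranteed service level. The function names are hypothetical.

```python
# Figures quoted for the initial release of SAP HANA Cloud.
MIN_STORAGE_TB = 1
MAX_STORAGE_TB = 90
RATE_TB_PER_DAY_PER_VCPU = 1
MAX_RATE_TB_PER_DAY_PER_TABLE = 16


def is_valid_storage_size(size_tb: int) -> bool:
    """Storage is sized in whole terabytes between the minimum and maximum."""
    return isinstance(size_tb, int) and MIN_STORAGE_TB <= size_tb <= MAX_STORAGE_TB


def ingestion_rate_tb_per_day(dedicated_vcpus: int) -> int:
    """Per-table ingestion rate, capped at the per-table maximum."""
    if dedicated_vcpus <= 0:
        raise ValueError("dedicated_vcpus must be positive")
    return min(dedicated_vcpus * RATE_TB_PER_DAY_PER_VCPU,
               MAX_RATE_TB_PER_DAY_PER_TABLE)


def days_to_ingest(table_size_tb: float, dedicated_vcpus: int) -> float:
    """Estimated days to ingest one table of the given size."""
    return table_size_tb / ingestion_rate_tb_per_day(dedicated_vcpus)


print(ingestion_rate_tb_per_day(4))
print(ingestion_rate_tb_per_day(32))
print(days_to_ingest(48, 32))
```

Because of the per-table cap, adding more than 16 dedicated vCPUs does not shorten the estimate for a single table under this model.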
Another important point is that the SAP HANA Cloud data lake inherits the security and data protection available for all of your instances of SAP HANA Cloud automatically, without any action needed from you.
Learn more here about SAP HANA Cloud Security.
The Database Explorer in SAP HANA Cloud
If you are new to SAP HANA databases, then you need to understand the SAP HANA Database Explorer. This is one of the most important tools at your disposal to manage your SAP HANA Cloud instance.
The Database Explorer offers a graphical interface and a SQL console, giving you a choice of how to access and manage your data. Among the actions you can perform in the Database Explorer are:
- Adding and managing remote sources
- Querying the database
- Modeling data
- Accessing, importing and exporting data
One of the most important parts of the Database Explorer interface is the Catalog, which you can find in the top left-hand corner of the screen. Each database you connect to your Database Explorer has its own catalog, and this is where you can see data, remote sources, adapters, and more.
Whenever you want to view, add, or manage any of the items in the catalog, just right-click the item and choose from the options available there.
The Database Explorer is a powerful tool that gives you the ability to manage and query your database and to add data to it from multiple sources. To learn how to use it in depth, please see the Getting Started with the SAP HANA Database Explorer guide.