Best Practices For Large Volume or IoT Workloads

Over the last few years, there has been a rapid surge in the adoption of smart devices. Everything from phones and tablets to smart meters and fitness devices can connect to the Internet and share data. You only have to follow @MarkRittman and his experiences with getting his kettle to boil remotely to see just how many devices within your own home can connect to the internet.

With all these smart devices comes a massive increase in the frequency and volume of data being ingested into and processed by databases. This scenario is commonly referred to as the Internet of Things or IoT.

Some people assume that a NoSQL database is required for an IoT workload because the ingest rate required exceeds the capabilities of a traditional relational database. This is simply not true.

Let’s look at some more industrial examples. Telecoms process tens of millions of Call Detail Records (CDRs) per second or connected manufacturing equipment, where every machine in a production line sends information about which component is currently being worked on; you quickly realize that these types of workloads have actually been around a really long time. And if we look at how the industry-leading company in each of these fields manages their workloads, we find they have all been using the Oracle Database.

With that in mind, I began working with Cecilia Gervasio Grant, Weifeng Shi, and my boss, Juan Loaiza, to define a set of best practices or management techniques not just to facilitate IoT workloads but to excel at them. We drew on our years of experience with leading telecoms, financial institutions, retailers, etc., to make realistic recommendations and to share real-world code examples, as well as performance numbers.

Throughout the paper, we use a simple analogy of someone going grocery shopping to help explain the reasoning behind each of the tuning techniques recommended. For example, if you know you are going shopping for a lot of stuff, you get a big cart right away rather than selecting a small handbasket. The same is true in the database. You want to allocate a BIGFILE tablespace with large initial extents so that you can ingest and store large volumes of data efficiently.

You would also never select one item at a time and pay for it before you select the next item on your list, which is akin to a single row insert followed by a commit within the Oracle Database. Instead, you would walk through the store, collect all of the items on your list, and then go to the checkout once, which is why we recommend using an array insert followed by a commit within the database.

We hope you find the recommendations outlined in the paper useful and would love to hear if you have used these or other techniques in your IoT or high-volume ingest workloads.

2 thoughts on “Best Practices For Large Volume or IoT Workloads”

  1. Hi,

    Thanks for the great article, very useful in non-IOT scenarios as well. We had one requirement to ingest lot of audit logging data for every screen by every user and in the solution, we used partitions (local indexes) – the only issue that we faced was during the maintenance (dropping old partitions), we’re invalidating lot of cursors and there is no way to avoid this. This is 24×7 application with a high degree of concurrency.

    Oracle should find a way to avoid reparsing during partition drop, I know it is a fundamental which is hard to change but maybe I put to product managers if they can find a way. 🙂

    1. Hi Anjul,

      Unfortunately there is on way to avoid the cursor invalidation in 11.2 and 12.

      However, in 12.2 we have done some work to implement fine-grained cursor invalidation so TRUNCATE PARTITION does not invalidate cursors, except for cursors that use global index access in certain cases.


Leave a Reply

Your email address will not be published. Required fields are marked *