How to do a Fuzzy Text Search on JSON #JoelKallmanDay

More and more apps I deal with store data as JSON documents in the Oracle Database. It is exceptionally convenient for the developers but doesn’t always make it easy to know exactly what data we have stored.

The good news is that Oracle offers multiple ways to help you understand precisely what data you have stored in those JSON documents. For example, you can use the built-in JSON Data Guide, which will trawl through all your documents and return a list of all the attributes you have stored.

But suppose you are interested in searching through your documents and only returning those that contain a particular word, value, or a variation thereof. In that case, you will want to take advantage of Oracle’s fuzzy text search or approximate string matching.

Imagine we have a table that stores movie reviews as JSON documents. I’m doing this demo in 19c, so I’m using a VARCHAR2 column, but from 21c onwards, you can use a JSON column.

CREATE TABLE movie_reviews( 
                     title varchar2(200),
                     cust_id NUMBER(26), 
                     cust_reviews varchar2(32000)
                     CONSTRAINT cr_is_json CHECK (cust_reviews IS JSON));

If you really want a JSON column in 19c, you can take advantage of this trick I learned from @Connor_mc_d.

 CREATE TABLE movie_rewiews( title varchar2(200), 
                             cust_id NUMBER(26), 
                             cust_reviews BLOB,
                             CHECK (cust_reviews IS json format oson));

Each review document contains details on the movie id, the star rating, and the review.

{“movie_id”: 5641,
“star_rating”: 1,
"Feedback":" Loved the tv show, but hated the movie. I am so disappointed."
}

We have been asked to find all the reviews that contain the word disappoint or variations of it. To facilitate a fuzzy text search, we need to create a text-based index on the feedback column inside the review documents.

CREATE SEARCH INDEX review_search_ind ON movie_reviews(cust_reviews) FOR JSON;

Once the index is created, we can run a fuzzy text search using the following query:

SELECT m.title, m.cust_reviews.feedback AS customer_review
FROM   movie_reviews m
WHERE  JSON_TEXTCONTAINS(m.cust_reviews, '$.feedback', 'fuzzy((disappoint))');

This results in the following entries being returned.

TITLE                    COMMENTS
-------------------- ----------------------------------------------------------------
Can You Ever Forgive Me? This movie was so disappointing
Top Gun                  Tom Cruise never disappoints. Definitely worth a watch.
Vice                     Perry’s performance in this movie is just so disappointing
Baywatch                 Loved the tv show but hated the movie. I’m so disappointed.
La La Land               Rent this movie you won’t be disappointed!
Batman                   Complete Disappointment

Alternatively, you can use the abbreviated syntax, which will return the same results as above:

SELECT m.cust_reviews.feedback AS comments
FROM movie_reviews m
WHERE JSON_TEXTCONTAINS(cust_reviews, '$.feedback', '?disapoint');

You can also use the stem search operator $. That will match verb forms sharing the same stem, so $disappoint will match “disappointing,” “disappointed,” and “disappoints,” but not “disappointment.”

SELECT m.cust_reviews.feedback AS comments
FROM movie_reviews m
WHERE JSON_TEXTCONTAINS(cust_reviews, '$.feedback', '$disapoint');

This blog was made possible by the lovely Roger Ford, the product manager for Oracle Text and JSON, who has taught me everything I know about text searches in the Oracle Database.

How to setup VNCServer on an OCI VM?

Now we are all working from home, I’ve noticed that my MAC laptop is severely overloaded when I do live demos during webinars. After all, it’s running my camera, Zoom, PowerPoint, my Java app, and monitoring tools.

So, I decided it was time to move my demo environment to the Oracle Cloud, where I quickly provisioned a 2-OCPU VM running Linux following the instructions in my previous blog post.

Screen Shot 2020-08-11 at 12.56.23 PM

Once I had my VM up and running, I wanted a proper desktop experience, so I needed VNC.

After a quick google search, I found the video below, which provides a very easy to follow, step by step guide to installing and configuring TigerVNC VNCServer on OCI infrastructure. I followed all of these steps except for the final stage where they describe adding the VNCServer to your firewall.

I’m married to a security expert, who strongly advised against this approach. He told me it would be far more secure to use an SSH tunnel instead of opening the firewall for the VNC port.

Below the video are the full set of commands I used in my setup, including how to establish the ssh tunnel, so you can quickly cut and paste them.

Continue reading “How to setup VNCServer on an OCI VM?”

Data-Driven Apps – What are they and the easiest way to develop them

At the moment, we hear a lot about how businesses need to become data-driven to remain competitive, how business need to understand their customer’s needs and quickly deliver value to those customers.

But how do you do that?

You take advantage of data-driven apps that allow users to create value or insights from data in real-time. 

What are Data-Driven Apps?

Data-driven apps operate on a diverse set of data (spatial, documents, sensor, transactional, etc.) pulled from multiple different sources, often in real-time and create value from that data in very different ways to traditional applications. For example, they may use Machine Learning to make real-time recommendations to customers or detect fraudulent transactions. Or use Graph analytics to identify influencers in a community and target them with specific promotions or perhaps use spatial data to keep track of deliveries.

These apps are also frequently deployed on multiple platforms, including mobile devices as well as standard web browsers, which means they need a flexible, scalable and reliability deployment platform. Given the demands on these apps, they need to be continuously developed to adapt to new use cases or user needs, and all updates must happen online as they have to be available 24×7.

When building data-driven apps, developers need to leverage an ever-increasing set of data processing and machine learning algorithms to meet these requirements.

So how should you go about developing and deploying data-driven apps quickly, efficiently, and more importantly, in a maintainable way?

Data-Driven Techniques and Technology

You take a data-first approach, or as A. Neil Pappalardo put it ‘A Minimize Code, Maximize Data‘ approach. In other words, you bring the algorithms to the data, not the data to the algorithms.

In the video below, I explain how to take advantage of the built-in features and functionality of the Oracle Database to develop and deploy data-driven apps efficiently. I also share some easy to follow code examples to demonstrate how much simpler your application code can be if you use this approach! Continue reading “Data-Driven Apps – What are they and the easiest way to develop them”

JEFF Talks From Kscope18

The first day of the ODTUG Kscope conference is always symposium Sunday. This year’s Database symposium, organized by @ThatJeffSmith, consisted of multiple, short, rapid  sessions, covering a wide variety of database and database tool topics, similar to Ted Talks but we called then JEFF Talks!

I was lucky enough to present 3 of this year’s JEFF Talks that I thought I would share on my blog since there wasn’t a way to uploaded to the conference site.

In the first session I covered  5 useful tips for getting the most out of your Indexes, including topics like reverse key indexes, partial indexes, and invisible indexes.

Next up was my session on JSON and the Oracle Database. In this session, I covered topics like what data type you should use to store JSON documents (varchar2, clob or blob) the pros and cons of using an IS JSON check constraint, and how to load, index, and query JSON documents.

In my finally JEFF talk I covered some of the useful PL/SQL packages that are automatically supplied with the Oracle Database. Since the talk was only 15 minutes I only touched on 4 of the 300 supplied packages you get with Oracle Database 18c but hopefully it will give you enough of a taste to get you interested in investigating some of the others!

 

 

Building a simple Node.js application with Oracle

Anyone who knows me will tell you, my expertise is the Oracle Database and not web applications or JSON. But my new role is all about pushing myself out of my comfort zone, so I wanted to get some hands on experience with both application development and JSON in the Oracle Database.

My colleague, Mark Drake, the PM for JSON in the Oracle Database, suggested I get my feet wet by trying the new Movie Ticketing Application tutorial available on the JSON page on Oracle.com.

The Movie Ticketing Application is written in Node.js and connects to the Oracle Database via a REST Data Service. Since I recently signed up for the Oracle Database Exadata Express cloud service, I had the perfect Database setup for a web application.

Anyone who knows me will tell you, my expertise is the Oracle Database and not web applications or JSON. But my new role is all about pushing myself out of my comfort zone, so I wanted to get some hands on experience with both application development and JSON in the Oracle Database.

My colleague, Mark Drake, the PM for JSON in the Oracle Database, suggested I get my feet wet by trying the new Movie Ticketing Application tutorial available on the JSON page on Oracle.com.

The Movie Ticketing Application is written in Node.js and connects to the Oracle Database via a REST Data Service. Since I recently signed up for the Oracle Database Exadata Express cloud service, I had the perfect Database setup for a web application.

Continue reading “Building a simple Node.js application with Oracle”

%d bloggers like this: