Customers Who Bought the Same Products

Target:

Take esProc as the computing layer for Java to find out: Customers who have bought the same specified products. There are two tables in database. Part of the data is as below:

esProc database computing-1

The standard approach is:

1. Group AllOrders table by customer_id.

2. Get sets of item_id.

3. Remove duplicate members.

4. Finally, perform filter by the specified item list and you can get the required target.

Check the pic below to make it clear:

esProc database computing-2

1. Retrieve data

Firstly, write SQL to retrieve data from each table. You can click on a cell to visually see the calculation result, such as AllOrders in A2 as the pic below shows. Don’t forget to define the external parameter: A specified product list.

esProc database computing-3

2. Group data

Next, reference A2 in A3 to group A2 by customer_id and rename the grouping criterion customer_id customer, as shown below:

esProc database computing-4

Note: ”~” represents a member of each group.

Blue font in the above pic indicates that this member is a set. Click on it to see the subordinate members:

esProc database computing-5

The above pic shows members of the value of OrderList field in A3’s first member. Obviously, the result of grouping is a set whose members are also sets. Sets make it easy to process data after grouping.

3. Objects reference

Then fetch item_id to form a new column: ItemList. Check cell A4 in the following pic:

esProc database computing-6

Through object references, esProc enables easy access to members of a multi-level set.

The next step is to filter away duplicate values from each group in A5. For example, the non-duplicated items that customer 1 purchased is shown below:

esProc database computing-7

As you can see, A3-A5 performs the computation step by step by dividing the computational target into smaller sub-targets. The advantages of simplifying a complex computation into multiple simple steps are:

Making computation more orderly and steady;

Enabling intuitive observation for each step;

Reducing potential bugs.

You can also do it in one step like B3 as shown in the pic below. B3 equals to A3-A5. The advantage is that the code is shorter and more agile.

esProc database computing-8

Next you reference the external parameter to filter out the desired customers, as B6 does.

esProc database computing-9

esProc achieves the computational object easily through object references. You can call the result in a reporting tool or Java application via JDBC.

4. Multi-table association
If you want to use SQL to further compute the average value of B6’s customer account balances, then multi-table association and nested computation will be involved. But through the object reference of esProc, you can avoid the complex algorithms.

Let’s insert a row in A3 to create an object reference:

esProc database computing-10

This means you can reference a record in A1 with the corresponding customer_id field in A2. And B7 inherits this reference relationship:

esProc database computing-11

Now in B8, you can easily compute the average value of B7 by using object reference.

esProc database computing-12

Note: An expression will intelligently changes without manual intervention while move between the cells. For example, the expression remains consistent when moving from A4 to B4, and the expressions in A5\A6\B7 adjust themselves intelligently to respective cells.

5. Set computations

How about the account balances of other users who are the users in B4 after those in B7 are deducted? How about their average balance? Look at C9 below, where the expression “B4\B7” represents the “difference” of the two sets.

esProc database computing-13

Problems like this can be solved intuitively by making use of set operations. Without support for explicit sets that esProc has, SQL has to deal with such a problem by transforming it into a complex query of multi-table association.