The Twin Goals

esProc aims to facilitate algorithm description and speed up execution.

An algorithm that is difficult to describe drives up development cost and makes coding really hard. A program that executes slowly wastes time and may be unusable in practice. Essentially, fast execution depends on being able to describe efficient algorithms. Since we can’t enhance hardware performance through software, using more efficient algorithms is the only choice. But if a fast algorithm is hard or impossible to describe, efficient execution is impossible, too.

How does esProc achieve both goals?

As we know, computers can only recognize and execute formalized programming statements. Using a computer to solve a problem means translating an algorithm into a programming language. If a language’s syntax and data types are badly designed, the translation can be more difficult than finding the solution itself. We may have a great, efficient algorithm yet find it hard or even impossible to express. It is like holding the key but not knowing how to use it, and then resorting to the clumsiest method: breaking the door open.

Unfortunately, this is not uncommon in structured data computations, especially multi-step procedural computations and order-related computations. SQL, the mainstream programming language for processing structured data, is responsible for this. Here are two simple cases:

  1. Finding the largest number of consecutive days on which a stock rises

It’s a piece of cake for a Java or C++ programmer: create a counter with an initial value of 0, sort the data by date and traverse it, adding 1 to the counter when the price rises and resetting it to 0 when the price falls, and finally take the maximum value the counter reached. The solution is natural, but difficult to implement in SQL. In SQL, we have to number the dates and build a grouping mark that puts each rising day in the same group as the day before it and starts a new group on each falling day, and then calculate COUNT() over these groups. The code is lengthy and not easy to understand.
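The counter-based solution described above can be sketched in a few lines of a procedural language. The following Python version is only an illustration, not esProc code; the function name `longest_rising_streak` and the `(date, price)` input layout are assumptions made for this example, and a day with an unchanged price is treated as ending the streak, which the text does not specify.

```python
def longest_rising_streak(records):
    """Return the longest run of consecutive rising days.

    `records` is an iterable of (date, closing_price) pairs; this layout
    is an assumption made for the sketch.
    """
    longest = 0      # maximum value the counter has reached so far
    current = 0      # the counter described in the text
    previous = None  # closing price of the previous day

    # Sort by date, then traverse once.
    for _, price in sorted(records, key=lambda r: r[0]):
        if previous is not None and price > previous:
            current += 1                  # price rose: extend the streak
            longest = max(longest, current)
        else:
            current = 0                   # fall (or, by assumption, no change): reset
        previous = price
    return longest


if __name__ == "__main__":
    sample = [
        ("2024-01-01", 10.0),
        ("2024-01-02", 11.0),
        ("2024-01-03", 12.0),
        ("2024-01-04", 11.0),
        ("2024-01-05", 12.0),
    ]
    print(longest_rising_streak(sample))
```

The whole computation is one sort plus one pass over the data, which mirrors the order-related reasoning in the description instead of forcing it into groups.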

  2. Finding the top 10 from 1 billion records

For a programmer familiar with algorithms, there’s no need to sort all the records. They will create an empty set that holds at most ten members and traverse the records, ensuring that the set always contains the top 10 of the records traversed so far. Just one traversal is needed and memory consumption is small. However, it’s hard, even impossible, to express the algorithm in SQL. Logically, we can only sort all the records and then take the top 10. We can rely on database optimization when the computation is simple, but there’s no better way when the computation is complicated (for example, one involving a subquery or a JOIN operation).
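As a rough illustration of the single-traversal idea, the sketch below keeps the ten-member set in a min-heap, which is one common way to do it but not the only one. It is written in Python rather than esProc, and the function name `top_n` and the assumption that records are directly comparable values are choices made for this example.

```python
import heapq


def top_n(records, n=10):
    """Return the n largest values from `records` in descending order.

    Only n values are held in memory at any time, so `records` can be
    any iterable, including one too large to sort in memory.
    """
    heap = []  # min-heap: heap[0] is the smallest of the current top n
    for value in records:
        if len(heap) < n:
            heapq.heappush(heap, value)     # still filling the set
        elif value > heap[0]:
            heapq.heapreplace(heap, value)  # drop the smallest, keep the new value
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    print(top_n(range(1_000_000), 10))
```

Each record is compared once against the smallest member of the set, so the full data set is never sorted.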

esProc can easily describe the fast algorithms in the two cases.

Strictly speaking, esProc isn’t responsible for producing solutions. It’s the programmer’s work to think of efficient algorithms. But esProc makes it easier, more convenient and more intuitive to code these algorithms by providing the necessary data types and a concise syntax. Of course, there is no such thing as a perfect, final tool, but we won’t stop making esProc better at achieving the twin goals.