esProc is a data computing language built around two powerful data objects, the TSeq and the cursor. It is optimized for computing over (semi-)structured data and can handle a wide range of complex computing problems with ease. Because it is designed solely for computing, esProc has a simple structure and is easy to learn. Users can adapt quickly to the esProc IDE and its development method.
esProc is not an object-oriented language and is free of complex concepts such as inheritance or overloading. esProc cannot be used to develop a complete application. Instead, it excels at handling the computing for applications.
esProc has an agile syntax, grid-style scripts, and complete debugging functions. It is well suited to application programmers who handle multi-step business computing involving complex algorithms, or combined computing over various data sources. By comparison, esProc is not suited to system programmers developing infrastructure or a whole package of utilities.
With esProc, programmers can focus more on understanding the business than on the technical implementation. The difficulty of converting business logic into program code is thus reduced dramatically.
esProc is built purely on Java, so it is as open and as easy to integrate as Java itself. It is also a dynamic language, as agile and flexible as other dynamic languages. In theory, esProc cannot outperform Java, but it does outperform other dynamic languages. It is optimized for (semi-)structured data and ships with a rich class library for data computing. esProc also supports inexpensive scale-out and parallel computing, which gives programmers much higher performance.
esProc vs. SQL
SQL suits simple queries that can be completed in one step. esProc is ideal for implementing complex business logic in multiple steps.
esProc is optimized for structured-data computing and supports step-by-step computing, more thorough set orientation, and object references, all of which make computing on structured data far less laborious. esProc also lets users design a suitable algorithm and data storage layout to achieve higher performance.
SQL is characterized by simple syntax and great universality. A great many applications support SQL, and resources for it are abundant.
esProc vs. Database
esProc has neither a metadata mechanism nor a SQL-like language. Although esProc's storage functions can operate on data, their only purpose is to achieve much higher throughput. As far as data computing is concerned, esProc can act as a database to some extent. However, it cannot replace the database completely.
esProc can work on top of databases, local files, distributed file systems, and other data storage solutions. It aims to complete complex procedural computing tasks, especially those that are hard to implement with traditional SQL or stored procedures (SP). In essence, esProc acts as a stored procedure that is more economical to scale, performs better, and is easier to develop.
esProc is easy to scale out and can be deployed on ordinary PCs, whereas databases are hard to scale out and normally require dedicated high-end equipment.
esProc has an open interface for external connections, so users can extend it with Java. Working with external programs, esProc can be used to process unstructured data, whereas a database system is relatively closed and can only process structured data.
esProc vs. Hadoop
esProc parallel system and MapReduce
Most reporting tools, such as BIRT, Crystal Report, and JasperReport, support JDBC directly and can therefore integrate with esProc seamlessly without any further coding. First, users complete the data source computing in an esProc script; second, they call the esProc script via JDBC from the reporting tool. The result from esProc is used in the report design just like the result of a standard stored procedure. With this method, the complex data source computing in a report can be handed over to esProc, and the reporting tool can focus on presentation.
Similarly, with the JDBC driver, users can incorporate the results of esProc high-performance computing or big data parallel computing into a report or into Java code.
3. Check Point instruction:
4. Runtime environment: JDK 1.6 or above. This version of the JDK is already included in the installation package.
5. The Free edition doesn’t require a registration code; it works right after installation.
6. The Standard edition requires a paid registration code, and the listed price is for 16 parallel threads. To purchase or inquire about other levels of parallelism, or to purchase in bulk, please email firstname.lastname@example.org.
To learn more about esCalc, please click: http://www.raqsoft.com/product-escalc
2. The esProc Editions:
4. Runtime environment: JDK 1.6 or above. This version of the JDK is already included in the installation package.
5. The registration code for the free edition is included in the installation package, so it runs right after installation.
6. Registration codes for the free edition and the distribution edition can be obtained from the Raqsoft website. Registration codes for the developer edition and the server edition are issued after payment, and the listed price is for 16 parallel threads. To purchase or inquire about other levels of parallelism, or to purchase in bulk, please email email@example.com.
7. The free edition supports only 4 parallel threads. The distribution edition provides 16 parallel threads by default; if more are required, please email firstname.lastname@example.org to apply for them free of charge.
To learn more about esProc, please click: http://www.raqsoft.com/product-esproc.
esCalc comes in two editions that share one installation package but use different registration codes. The registration code provided here is for the free edition only; for the other edition, please contact us.
Please read the esCalc License Policy carefully: if you decide to download esCalc, any dispute concerning your registration information and privacy will be governed by this agreement and declaration.
|Installation Package(build 14.03.04)|
|Update Time: 03/04/2014 File Size: 64.5MB|
|Free Registration Code: JyNn5-jm3RN-nzjI9-KbgoLMg|
|Editions: Free; Standard|
|Runtime Environment: JDK1.6 or above|
|esCalc_Getting Started||esCalc_Tutorial||esCalc_Code Sample||esCalc_Function Reference|
|esCalc_User Reference||esCalc_User Exploration||esCalc_Case|
High Cost of Database Software & Hardware and Management
With the explosive growth of informatization worldwide, the cost of software, hardware, and management keeps rising.
To deal with the increasing computation pressure on the database, a common approach is to add CPUs or purchase new servers. Since a database needs a high-performance server, or a dedicated server with integrated software and hardware, the cost becomes very high. Meanwhile, adding a CPU, storage space, or a server node means buying more expensive database licenses, which increases the cost further.
To decrease the software and hardware cost of the database, we need software that can share the database's computation pressure.
The storage and management cost caused by redundant data accounts for a very high proportion of an enterprise’s expenses. As new business demands emerge continuously, a constantly expanding, N-level, tree-dimensional redundant data system is generated.
To reduce the cost of hardware expansion and management, we need software that offers extensible, affordable storage together with complete computing ability.
Complex computation goals amplify the risk to application systems
As the market environment and business demands become more complicated, business computing goals become more complex as well. However, our application systems are not yet ready to meet this trend.
SQL/SP provides complete computation ability for structured data and works fine for simple computation, but it exposes its disadvantages on increasingly complex computation tasks: no step-by-step computation, sets that are not explicit, unordered computation, the lack of an object reference mechanism, and so on.
Because of this inconvenience and inefficiency, complex computation with SQL/SP has become the bottleneck of database development.
Reporting tools support simple computation on a single database very well, but development cost increases dramatically when it comes to complex computation.
For a single database source, the inherent drawbacks of SQL/SP greatly reduce development efficiency. For computation over multiple data sources, for instance consolidating multiple data sources into a single one, developers often resort to an expensive data warehouse, to table join operations that are computationally inadequate and tied to a specific reporting tool, or even to self-defined data sets of lower efficiency.
Java is the mainstream language for developing application systems, but complex computing tasks still hold back the development efficiency of Java applications.
For computation on a single database, Java can embed SQL. However, SQL does not support step-by-step computation, cannot break a complex task into simple steps, and offers no way to observe details or do real debugging. For computation across multiple data sources or non-database data sources, many developers have to choose an expensive, inefficient data warehouse.
Business opportunities are lost to bloated BI solutions
Compared with a decade ago, today's market changes faster and demands shift more frequently, so the requirements on BI tools are higher. But the popular BI solutions are bloated and expensive, and business opportunities are missed as a result.
The market needs agile desktop BI that runs directly in a desktop environment and reduces dependence on dedicated servers and technical teams. Desktop BI should also provide complete core BI functions and be usable by business experts on their own. It should allow users to do creative work and arrive at reliable results before a business opportunity is missed.
Standalone grouping performance
Below is a performance comparison between esProc and Oracle in a standalone environment on big data grouping and summarizing. We compare three scenarios: single thread, four threads, and column storage. Each scenario is divided into three types according to the grouping columns and summarizing columns. Two sets of data are tested: a wide table of 100 columns and a narrow table of 10 columns.
Hardware: Dell PowerEdge T610, CPU Intel Xeon E5620*2, RAM 20G, HDD Raid5 1T
Software: CentOS 6.4, JDK 1.6, Oracle 11g, esProc 3.1
As can be seen, Oracle outperforms esProc on a standalone machine with a single thread. However, Oracle's parallel option has no actual effect. On a standalone machine with 4 threads, the computing performance of esProc matches or even exceeds that of Oracle. With column storage, esProc can gain a performance advantage on the scale of an order of magnitude.
Standalone join performance
Below is a performance comparison between esProc and Oracle in a standalone environment on big data join operations. We compare three scenarios: single thread, four threads, and column storage. Two sets of data are tested: a wide-table join and a narrow-table join.
Hardware: Dell PowerEdge T610, CPU Intel Xeon E5620*2, RAM 20G, HDD Raid5 1T
Software: CentOS 6.4, JDK 1.6, Oracle 11g, esProc 3.1
As can be seen, the more complicated the computing becomes, the more obvious the advantage of esProc parallel computing gets. By taking advantage of multiple threads, esProc surpasses Oracle in performance.
esProc supports lightweight parallel computing: for huge data volumes, heavy computing workloads, and high degrees of concurrency, it distributes the workload evenly across multiple nodes. Within a node, data can be shared among threads, which effectively raises multi-thread computing performance. Between nodes, users can exchange data either through external storage or directly in memory, depending on the size of the result set, and so strike a balance between fault tolerance and performance. For small, highly concurrent tasks, in-memory computing can be used to boost performance. For large, time-consuming jobs, external storage can be used to ensure reliability.
Cluster grouping performance
Below is a performance comparison of esProc, Hive, and Impala on big data grouping in a cluster environment. Depending on the grouping columns and summary columns, four algorithms are tested. Two sets of data are used: a wide table of 106 columns and a narrow table of 11 columns.
Hardware: 4 PCs, CPU Intel Core i5 2500, RAM 16G, HDD 2T/7200rpm, LAN 1000M
Software: CentOS 6.4, JDK 1.6, CDH 5.0 beta, esProc 3.1
In-memory exchange is faster for small data sets, while exchange through external storage is more stable for big data sets. esProc supports both methods, and in-memory exchange was used in the above test. As can be seen, esProc performs better than Impala, which supports only in-memory exchange, and several times better than Hive, which supports only exchange through external storage.
In other scenarios, where big data sets must be exchanged through external storage, esProc offers a more efficient solution than HDFS and is more powerful than the MapReduce-based Hive.
Cluster join performance
Below is a performance comparison of esProc, Hive, and Impala on big data joins in a cluster environment. Two sets of data are tested: a wide-table join and a narrow-table join.
Hardware: 4 PCs, CPU Intel Core i5 2500, RAM 16G, HDD 2T/7200rpm, LAN 1000M
Software: CentOS 6.4, JDK 1.6, CDH 5.0 beta, esProc 3.1
As can be seen, as the computing intensity grows, the performance gap between Impala and esProc narrows gradually. This is probably the result of Impala's native code mechanism. In this comparison, although Hive and esProc both interpret and execute their code, the performance gap between them is even greater. We can therefore conclude that in-memory computing boosts performance significantly.
Scale-up and Scale-out
esProc offers multi-thread computing on a standalone machine to make use of expanded hardware capacity. Users can set the number of threads according to the number of CPU cores and the computing workload, and can expand memory and hard disk capacity at any time. esProc scales up well and supports large memory, local disk files, and redundant data.
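The idea of sizing a multi-thread grouping job to the available CPU cores can be sketched in Python (not esProc syntax). The function names and sample rows below are hypothetical and only illustrate the split, aggregate, and merge pattern. Note that CPython threads do not speed up CPU-bound work, so the sketch shows the structure rather than the performance.

```python
import os
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def sum_by_key(rows):
    """Group (key, amount) rows by key and sum the amounts."""
    totals = Counter()
    for key, amount in rows:
        totals[key] += amount
    return totals


def parallel_sum_by_key(rows, workers=None):
    """Split rows into segments, aggregate each in a thread, merge the partials."""
    # Default the thread count to the number of CPU cores.
    workers = workers or os.cpu_count() or 1
    # Ceiling division so every row lands in exactly one segment.
    size = max(1, -(-len(rows) // workers))
    segments = [rows[i:i + size] for i in range(0, len(rows), size)]
    merged = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(sum_by_key, segments):
            merged.update(partial)  # Counter.update adds the partial sums
    return dict(merged)


rows = [("east", 10), ("west", 5), ("east", 7), ("north", 1), ("west", 3)]
result = parallel_sum_by_key(rows, workers=4)
print(result)
```

Because each segment is aggregated independently and the partial sums are merged at the end, the result does not depend on the number of threads chosen.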
esProc adopts a decentralized multi-node structure to support parallel computing that scales out. Its requirements on computing nodes are relatively low: any kind of server can act as a computing node, whether a low-end PC or a mid-range or high-end server. esProc runs on both Windows and Linux.
Nodes can be added to or removed from the cluster freely as necessary. The node that launches a task can also be replaced or assigned by the programmer. A node can act as a sub-node, or as a main node that distributes tasks to the nodes below it. In this way, multi-level nested calls can be formed.
Intelligent job Distribution
With a controllable task distribution mechanism, parallel computing in esProc is much more flexible. Programmers can set the scope of nodes involved in the computing, control the scope of each subtask, and distribute the computing workload according to the characteristics of the tasks and nodes. In addition, external parameters can be used in parallel computing, so users can adjust the scope of nodes, the number of tasks, and the size of each task themselves. esProc is designed especially for small and medium-sized clusters. Such clusters have relatively few nodes and are less likely to encounter errors, so they can usually run normally for a long time. Their users are more eager to have a controllable job distribution mechanism.
esProc supports automatic node selection and fault tolerance. It searches for a free node within the user-specified scope and, in case of error, automatically switches to another node to proceed. esProc supports local files and LAN files to further lower the hardware requirements for scale-out. It also supports HDFS and other redundant data storage to ensure the reliability and stability of big data computing. Users can trade off cost efficiency against reliability according to their computing task.
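The node selection and failover behavior described above can be illustrated with a minimal Python sketch. It is not esProc's actual scheduler: the node names, the `is_free` check, and the use of `ConnectionError` to signal a failed node are all assumptions made for the example.

```python
def run_with_failover(task, nodes, is_free):
    """Run task on the first free node in scope; on failure, try the next one."""
    errors = []
    for node in nodes:
        if not is_free(node):
            continue  # skip busy nodes
        try:
            return node, task(node)
        except ConnectionError as exc:
            errors.append((node, str(exc)))  # remember the failure, move on
    raise RuntimeError(f"no node in scope completed the task: {errors}")


# Hypothetical cluster: node1 is busy, node2 fails, node3 succeeds.
nodes = ["node1", "node2", "node3"]
busy = {"node1"}


def task(node):
    if node == "node2":
        raise ConnectionError("node2 is down")
    return f"done on {node}"


chosen, output = run_with_failover(task, nodes, lambda n: n not in busy)
print(chosen, output)
```

Restricting the search to a caller-supplied list of nodes mirrors the user-specified scope mentioned in the text.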
Data share and exchange
esProc supports data sharing within a node. If the same data is used by every thread of a task, esProc allows you to set it as a global variable shared by multiple threads on the node, which improves performance and memory efficiency. For example, when a node performs big data computing that first joins and then summarizes, the joined table can act as a global constant in most cases. To avoid conflicts between concurrent tasks, esProc allocates a private task space to each task, so variables of the same name in different tasks do not conflict with each other. By covering both global variables and private spaces, esProc raises performance while effectively guaranteeing the stability of tasks.
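The combination of one shared lookup table and a private space per task can be sketched in Python with threads. This is an analogy only, not esProc's mechanism, and the table contents, task names, and order data are hypothetical.

```python
import threading

# Shared, read-only lookup table: loaded once and visible to every thread.
PRODUCT_NAMES = {1: "apple", 2: "pear"}

# Private per-task space: each thread sees its own attributes.
task_space = threading.local()


def summarize(task_id, orders, results):
    """Sum the quantities of orders whose product exists in the shared table."""
    task_space.total = 0  # same variable name in every task, no conflict
    for product_id, qty in orders:
        if product_id in PRODUCT_NAMES:
            task_space.total += qty
    results[task_id] = task_space.total


results = {}
jobs = {"t1": [(1, 3), (2, 4)], "t2": [(1, 10), (9, 99)]}
threads = [
    threading.Thread(target=summarize, args=(tid, orders, results))
    for tid, orders in jobs.items()
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

The lookup table is held in memory once, while each task keeps its own `total`, so concurrent tasks cannot overwrite each other's intermediate values.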
esProc supports two methods of data exchange between nodes: direct in-memory exchange and file buffering in external storage. For a small result set, data can be exchanged between nodes directly in memory. For a big data set, a file in external storage can be used to exchange the data. Users can choose the exchange method according to the characteristics of the task, so as to balance fault tolerance against performance. For small tasks running concurrently, in-memory exchange achieves higher performance. For an individual big task, exchange through external storage ensures reliability.
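The choice between the two exchange methods can be sketched in Python as a simple size threshold. The threshold value, the JSON file format, and the function names are assumptions for illustration; esProc's own exchange formats are not shown in the source.

```python
import json
import os
import tempfile


def exchange(result_rows, max_in_memory_rows=1000):
    """Hand over small results in memory and large results through a file."""
    if len(result_rows) <= max_in_memory_rows:
        return "memory", result_rows
    # Large result: buffer it in a temporary file on external storage.
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as f:
        json.dump(result_rows, f)
    return "file", path


def receive(kind, payload):
    """Read the result back, whichever way it was handed over."""
    if kind == "memory":
        return payload
    with open(payload) as f:
        rows = json.load(f)
    os.remove(payload)  # clean up the buffer file
    return rows


small = [[1, 2]]
big = [[i, i * 2] for i in range(10)]
small_kind, small_payload = exchange(small, max_in_memory_rows=5)
big_kind, big_payload = exchange(big, max_in_memory_rows=5)
big_received = receive(big_kind, big_payload)
print(small_kind, big_kind)
```

A file buffer survives the failure of the receiving side, which is why the slower path is the more fault-tolerant one.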
esProc is designed especially for small and medium-sized clusters. The failure rate of such clusters is low, and their reliability is relatively high. esProc therefore provides both methods for users to choose from freely, and allows users to make performance the top priority.
All in all, esProc provides a scripting language designed specifically for distributed computing, with a complete computing architecture. Users can translate business requirements intuitively into this (semi-)structured language to implement the details of the underlying algorithm. The difficulty of developing parallel computing is thus reduced effectively.
Besides retrieving data, esProc can also write its computing results back to the original data source, to data sources of other types, or to multiple data sources at the same time. esProc has built-in functions for writing back to various data sources, covering both the modification of a single record and the write-back of massive data.
Because esProc provides a consistent JDBC interface to upper-level applications, esProc and the data sources together can form an easy-to-use hybrid database. In the past, multi-data-source computing required high-end reporting tools, hard-to-maintain ETL, and expensive data warehouses. esProc is not bound to any specific data source and supports combined computing over various data sources by nature. It can reduce the difficulty of correlating big data with traditional databases, remove the single-source restriction on reports, and enable Java applications to cope with increasingly complex data environments.
For example, consider counting the trading days on which a stock has been rising consecutively. A typical SQL statement would be:
With its agile syntax, the corresponding esProc code can be as simple as the following:
Unlike ordinary text code, esProc code is written in a cellset (i.e. a grid), as shown below:
esProc has an Excel-like grid user interface, in which the code in the grid is executed from left to right and top to bottom. Each cell is named by the combination of its column and row (for example, A4), which is unique and natural. esProc users can reference a cell by name directly, without defining any variables.
In the grid-style interface, code is presented with natural formatting, alignment, and indentation, with no need for manual typesetting. The grid-style presentation not only keeps the code neat and clean but also gives users an intuitive way to organize the relationships between computing steps.
Based on a step-by-step computing model, grid-style code allows users to observe, think about, and write code from the business perspective. A problem can be solved step by step by decomposing a complex computing goal into several steps, each of which can reference the results of previous steps. The step-by-step mechanism also makes it easy for esProc users to detect and correct errors.
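To make the step-by-step idea concrete, here is a Python sketch (not esProc syntax) of the stock example. It assumes the goal is the longest run of consecutive trading days on which the closing price rose; the function name and the price list are hypothetical. Each step produces an intermediate result that the next step references and that can be inspected while debugging.

```python
def longest_rising_run(closing_prices):
    """Longest run of consecutive days whose close beat the previous close."""
    # Step 1: for each day after the first, did the price rise?
    rising = [today > prev
              for prev, today in zip(closing_prices, closing_prices[1:])]

    # Step 2: length of the rising streak that ends on each day.
    streaks = []
    current = 0
    for up in rising:
        current = current + 1 if up else 0
        streaks.append(current)

    # Step 3: the answer is the longest streak (0 if there is none).
    return max(streaks, default=0)


prices = [10.0, 10.5, 11.0, 10.8, 11.2, 11.5, 11.9, 11.7]
print(longest_rising_run(prices))  # prints 3
```

The intermediate lists `rising` and `streaks` play the role that earlier cells play in a grid: later steps refer to them by name instead of nesting everything into one expression.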
By comparison, a SQL solution is more complex because SQL doesn’t support step-by-step computing. Java, VB, and other high-level languages do support step-by-step computing, but they are neither good at processing structured data nor designed as database-oriented computing languages. esProc has the advantage here, with a much simpler syntax.
Rich In-built Library Functions
esProc offers abundant library functions to support grouping, looping, sorting, and filtering, as well as operations on sets and ordered sets.
Group and loop:
In the above figure, A4 is the loop statement, and B4-B10 form the loop body.
esProc supports basic grouping, alignment grouping, and enumeration grouping. Unlike in SQL, the grouped data in esProc is presented as a set, each member of which is itself a set of data of a generic type. Grouping and summarizing can therefore be conducted step by step, and the grouping result can be reused.
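The difference between grouping that keeps each group as a set and grouping that aggregates immediately can be sketched in Python (not esProc syntax). The `group_by` helper and the order records are hypothetical.

```python
from collections import defaultdict


def group_by(records, key):
    """Split records into groups; each group keeps its member records."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


orders = [
    {"client": "A", "amount": 120},
    {"client": "B", "amount": 40},
    {"client": "A", "amount": 30},
    {"client": "B", "amount": 75},
]

# Step 1: group once and keep the groups.
groups = group_by(orders, "client")

# Step 2: reuse the same groups for two different summaries.
totals = {c: sum(r["amount"] for r in rs) for c, rs in groups.items()}
largest = {c: max(rs, key=lambda r: r["amount"]) for c, rs in groups.items()}
print(totals, largest)
```

Because the groups survive as ordinary collections, a second summary does not require grouping the data again.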
Sorting and filtering are typical operations on structured data, and esProc provides easy-to-use loop functions for them. In addition, esProc's loop functions remove the need for most loop statements, which reduces programming difficulty and improves development efficiency.
Operations on set:
esProc supports concatenation, union, intersection, difference, multiplication, alignment, alignment comparison, comparison, and several other operations. In addition, it provides algorithms for retrieving members forward or backward, subset operations, copying, reversing the order of members, and matrix transposition.
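A few of these operations can be illustrated on ordered Python lists. The semantics chosen here (keep the order of the first operand, add only members not already present for union) are an assumption for the sketch and may differ from esProc's exact rules, especially for duplicates.

```python
def concat(a, b):
    """All members of a followed by all members of b, duplicates kept."""
    return a + b


def union(a, b):
    """Members of a, then the members of b that are not in a."""
    return a + [x for x in b if x not in a]


def intersection(a, b):
    """Members of a that also appear in b, in a's order."""
    return [x for x in a if x in b]


def difference(a, b):
    """Members of a that do not appear in b, in a's order."""
    return [x for x in a if x not in b]


a = [1, 2, 3, 4]
b = [3, 4, 5]
print(concat(a, b))        # [1, 2, 3, 4, 3, 4, 5]
print(union(a, b))         # [1, 2, 3, 4, 5]
print(intersection(a, b))  # [3, 4]
print(difference(a, b))    # [1, 2]
print(a[::-1])             # members in reverse order: [4, 3, 2, 1]
```

Unlike Python's built-in `set`, these list-based versions preserve member order, which matters for the ordered sets discussed next.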
“Ordered” indicates that the data is stored in a certain order. Each member has an absolute or relative sequence number, which users can use to retrieve data and perform other order-related operations such as locating, ranking, and sorting.
esProc supports ordered sets. The sequence number is a convenient device for retrieving set members and performing order-related computing. Ordered computing is a typically tough problem for SQL. One example is retrieving data at a relative position, that is, the records (or groups) a given number of places before or after the current record (or group). With the ordered set mechanism of esProc, users can solve such problems easily.
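Relative-position access can be sketched in Python (not esProc syntax) using list indexes as sequence numbers. The two helper functions and the sales figures are hypothetical examples.

```python
def change_from_previous(values):
    """Difference between each member and the one just before it."""
    return [None if i == 0 else values[i] - values[i - 1]
            for i in range(len(values))]


def window_average(values, before=1, after=1):
    """Average over the members from `before` places back to `after` ahead."""
    averages = []
    for i in range(len(values)):
        window = values[max(0, i - before): i + after + 1]
        averages.append(sum(window) / len(window))
    return averages


monthly_sales = [100, 120, 90, 150]
print(change_from_previous(monthly_sales))  # [None, 20, -30, 60]
print(window_average(monthly_sales))
```

Both helpers rely only on a member's position relative to the current one, which is the kind of access that is awkward to express when the data has no inherent order.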
Easy to develop and debug
The step-by-step computing of esProc makes it easier to monitor intermediate results and debug the code. The development efficiency of esProc is therefore greatly enhanced.
esProc provides complete debugging functions, including setting a breakpoint, running to a breakpoint, and step-by-step execution. SQL does not support step-by-step execution, so SQL users find it hard to monitor intermediate results, simplify the computing, and debug.