High development efficiency. esProc is a grid-based scripting tool. Computing logic is laid out in a 2D grid, which makes it easier to express a business algorithm in code. The grid naturally supports step-by-step computing: since each cell represents a computing unit or step, esProc is good at breaking the complicated business logic of ETL into simple steps. The grid gives an intuitive view of code indentation and scope, and streamlines cell reference and reuse. Users can reference any cell by its native cell name, so they do not need to define variables. By clicking a cell, users can inspect its computed result directly, without searching through a list of variables. The grid also offers true debugging functions for ETL.
Agile algorithm. With support for a true set data type, esProc simplifies computing on structured data and allows flexible computing from the business perspective. esProc supports ordered sets, so users can access set members by position and perform serial-number-related computing, for example, ranking, sorting, year-over-year comparison, and link relative ratio comparison. With its “set of sets” mechanism for representing groups, esProc can solve various grouping problems easily, including those involving equal, align, and enum groupings. In addition, users can operate on individual records in a data set in the same way as on objects. Such disassociated records give users much more flexible access to the data. Many ETL computations that are tough for SQL and stored procedures (SP) can be expressed and solved quite easily with esProc's agile syntax.
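The ideas of an ordered set and a “set of sets” can be illustrated outside esProc. The following is a minimal sketch in plain Python, not esProc syntax; the sample data and the function names `link_relative_ratios` and `group_as_sets` are hypothetical and chosen only for illustration. It shows position-based access to neighbouring members and grouping that keeps each group as a set of the original records.

```python
# Illustrative sketch only: plain Python, not esProc syntax.
# All data and function names below are hypothetical.

def link_relative_ratios(ordered_values):
    """For an ordered set, divide each member by its predecessor (by position)."""
    return [
        ordered_values[i] / ordered_values[i - 1]
        for i in range(1, len(ordered_values))
    ]

def group_as_sets(records, key):
    """Group records into a list of groups; each group holds the original records."""
    groups = {}
    for record in records:
        groups.setdefault(key(record), []).append(record)
    return list(groups.values())

# Monthly sales, already ordered by month.
monthly_sales = [100.0, 120.0, 90.0]
ratios = link_relative_ratios(monthly_sales)

# Records are ordinary objects (dicts here); groups reference them, not copies.
orders = [
    {"region": "East", "amount": 10},
    {"region": "West", "amount": 5},
    {"region": "East", "amount": 7},
]
groups = group_as_sets(orders, key=lambda r: r["region"])
totals = [sum(r["amount"] for r in g) for g in groups]

# Changing a record through its group changes the original record as well.
groups[0][0]["amount"] = 20
```

Because each group is itself a collection of records rather than a single aggregated row, any further computation (sorting within a group, comparing neighbouring members, picking the top record) can be applied to the group after grouping.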
Abundant support for various data sources. esProc can write results back to one or more data sources, and supports computing across different data sources, including all kinds of databases and non-database file sources. It offers abundant functions for computing on both structured and unstructured data. Additionally, esProc supports local files and remote files on a LAN, big data files in the HDFS distributed file system, common txt files and Excel sheets, and files in a proprietary format that offers high performance.
Convenient command line scheduling. Users can execute esProc scripts directly from the command line, schedule regular launches with the scheduling facility provided by the OS, and run ETL tasks on various editions and versions of Windows, Linux, Unix, or Mac. esProc also provides a JDBC API, so users can implement more flexible scheduling and control of ETL tasks in code.
Parallel framework speeds up the ETL process. esProc supports parallel computing over big data, and can accomplish ETL tasks involving terabytes of data in HDFS files or databases. With the parallel computing framework, a massive amount of data can be allocated evenly to multiple computing nodes, so each node only needs to handle a small share of the data. esProc supports distributed computing at multiple levels. Each node can act either as the main node, which allocates tasks and summarizes results, or as a sub node, which performs the detailed computing. A node machine can be a high-end server or an inexpensive PC, running a Windows client or a Linux server.
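The allocate-then-summarize pattern described above can be sketched as follows. This is a minimal illustration in plain Python, not esProc's parallel framework or API: threads on one machine stand in for computing nodes, and the names `split_evenly`, `sub_node_task`, and `main_node` are hypothetical.

```python
# Illustrative sketch only: threads stand in for computing nodes.
# This is not esProc's API; all names are hypothetical.
from concurrent.futures import ThreadPoolExecutor

def split_evenly(rows, n_nodes):
    """Split rows into n_nodes contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(rows), n_nodes)
    chunks, start = [], 0
    for i in range(n_nodes):
        size = base + (1 if i < extra else 0)
        chunks.append(rows[start:start + size])
        start += size
    return chunks

def sub_node_task(chunk):
    """Work done by one sub node: a partial sum and row count for its chunk."""
    return sum(chunk), len(chunk)

def main_node(rows, n_nodes=4):
    """Main node: allocate chunks to sub nodes, then summarize their partial results."""
    chunks = split_evenly(rows, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(sub_node_task, chunks))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total, count

total, count = main_node(list(range(10)), n_nodes=4)
```

The multi-level arrangement mentioned in the text would correspond to a sub node calling `main_node` on its own chunk, so that it acts as the main node for a further set of sub nodes.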