Gimel Data API

Gimel provides a unified Data API to access data from any storage, such as HDFS, GS, Alluxio, HBase, Aerospike, BigQuery, Druid, Elastic, Teradata, Oracle, MySQL, and SFTP.


Stack & Version Compatibility

| Compute/Storage/Language | Version | Grade | Documentation | Notes |
|---|---|---|---|---|
| Scala | 2.11.8 | PRODUCTION | | Data API is built on Scala 2.11.8. Regardless, the library should be compatible as long as the Spark major version of the library and the environment match. |
| | 2.6.6 | PRODUCTION | PySpark Support | Data API / GSQL works fully with PySpark as long as the Spark version in the environment and the Gimel library match. |
| | 2.2.0 | | | This is the recommended version. |
| | 2.7.3 | | | This is the recommended version. |
| CSV | 2.7.3 | PRODUCTION | CSV Reader Doc | CSV Reader & Writer for HDFS. |
| Restful/Web-API | 2.7.3 | PRODUCTION WITH LIMITATIONS | Restful/Web-API Doc | Allows accessing data from any source that supports a REST API. |
| Cross-Cluster | 2.7.3 | PRODUCTION WITH LIMITATIONS | Cross-Cluster Doc | Allows accessing data across clusters and Alluxio. |
| Kafka | 0.10.2 | PRODUCTION | Kafka Doc | v0.10.2 is PayPal's supported version of Kafka. |
| HBase | 1.2 | PRODUCTION WITH LIMITATIONS | HBase Doc | Leverages the SHC Connector internally and also supports Batch/Get/Puts. |
| Aerospike | 3.14 | EXPERIMENTAL | Aerospike Doc | Experimental API for Aerospike reads / writes. |
| Cassandra | 2.0 | EXPERIMENTAL | Cassandra Doc | Experimental API for Cassandra reads / writes. Leverages the DataStax Connector. |
| ElasticSearch | 5.6.4 | PRODUCTION | ElasticSearch Doc | Has special support for PayPal's daily ES indexes. |
| Hive | 1.2 | PRODUCTION | Hive Doc | |
| Teradata | 1.6.2 | EXPERIMENTAL | Teradata Doc | Experimental API only. Uses the JDBC Connector internally. |
| Druid | 0.82 | PRODUCTION | Druid Doc | Only writes (non-batch mode). |
| SFTP | 0.82 | PRODUCTION | SFTP Doc | Read/write files from/to an SFTP server. |
| GSQL | 1.0 | PRODUCTION | GSQL Doc | Refer to the link for using the GSQL (Gimel SQL) API. |
| TestSuite | 1.0 | PRODUCTION | Test-Suite Doc | Current implementation works with Catalog Provider: Hive. |
| Gimel Logging | 0.4.2 | PRODUCTION | Gimel Logging Doc | This is the Gimel Logging Framework. |
| Unified Data Catalog | 0.0.1 | PRODUCTION | UDC Doc | This is the Unified Data Catalog. |
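
The compatibility notes above say the library should work as long as the Spark major version of the library and the environment match. A minimal sketch of that check follows; the helper names `spark_major` and `is_compatible` are hypothetical and are not part of Gimel, and the rule is assumed to compare only the first component of a dotted version string.

```python
def spark_major(version: str) -> int:
    """Return the major component of a dotted version string such as '2.2.0'."""
    return int(version.strip().split(".")[0])


def is_compatible(library_spark: str, environment_spark: str) -> bool:
    """Hypothetical check: True when both Spark versions share a major version."""
    return spark_major(library_spark) == spark_major(environment_spark)


# Same major version (2.x on both sides): treated as compatible.
print(is_compatible("2.2.0", "2.3.1"))
# Different major versions (2.x vs 3.x): treated as incompatible.
print(is_compatible("2.2.0", "3.0.0"))
```

In a PySpark session, the environment value would typically come from `spark.version`; the Spark version the Gimel library was built against has to be taken from the build or release you are using.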

Questions