
Opaque

Secure Apache Spark SQL


Welcome to the landing page of Opaque SQL! Opaque SQL is a package for Apache Spark SQL that enables processing over encrypted DataFrames using the OpenEnclave framework.

Quick start

To get started quickly with Opaque SQL, download our Docker image, which also includes the other open source projects in the MC2 project.

docker pull mc2project/mc2
docker run -it -p 22:22 -p 50051-50055:50051-50055 -w /root mc2project/mc2

Change into the Opaque directory and export the Opaque and OpenEnclave environment variables.

cd opaque
source opaqueenv
source /opt/openenclave/share/openenclave/openenclaverc
export MODE=SIMULATE

You are now ready to run your first Opaque SQL query! First, start a Scala shell:

build/sbt console

Next, import Opaque's DataFrame methods:

import edu.berkeley.cs.rise.opaque.implicits._
edu.berkeley.cs.rise.opaque.Utils.initOpaqueSQL(spark)

To convert an existing DataFrame into an Opaque encrypted DataFrame, simply call .encrypted:

val data = Seq(("foo", 4), ("bar", 1), ("baz", 5))
val df = spark.createDataFrame(data).toDF("word", "count")
val dfEncrypted = df.encrypted

You can use the same Spark SQL API to query the encrypted DataFrame:

val result = dfEncrypted.filter($"count" > lit(3))
result.explain(true)
// [...]
// == Optimized Logical Plan ==
// EncryptedFilter (count#6 > 3)
// +- EncryptedLocalRelation [word#5, count#6]
// [...]
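
As a sanity check, the same filter can be run on the original, unencrypted DataFrame with plain Spark SQL, giving a baseline to compare against the encrypted plan above. The sketch below uses only standard Spark APIs and is written as a standalone snippet. The local SparkSession, the application name, and the names `plainResult` and `plainWords` are assumptions for illustration and are not part of Opaque SQL. Inside `build/sbt console`, where `spark` is already defined, the session setup and the `spark.implicits._` import can likely be skipped.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

// Assumption: a throwaway local session for illustration only.
// In the Opaque sbt console, reuse the existing `spark` instead.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("opaque-plaintext-baseline") // hypothetical name
  .getOrCreate()
import spark.implicits._

// Same data as in the quick start, left unencrypted.
val data = Seq(("foo", 4), ("bar", 1), ("baz", 5))
val df = spark.createDataFrame(data).toDF("word", "count")

// Same predicate as the encrypted query, applied to the plaintext DataFrame.
val plainResult = df.filter($"count" > lit(3))

// Print the plaintext plans for comparison with the encrypted plan.
plainResult.explain(true)

// Collect the words whose count exceeds 3, sorted for a stable comparison.
val plainWords = plainResult.collect().map(_.getString(0)).sorted.toSeq
```

The plaintext plan should contain only ordinary Spark operators, whereas the encrypted plan shown above contains Opaque's `EncryptedFilter` and `EncryptedLocalRelation`. Here `plainWords` holds the two words whose count is greater than 3, `baz` and `foo`.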

Congrats, you've run your first encrypted query using Opaque SQL!

Documentation

For more details on building, using, and contributing, please see our documentation.

Paper

This open source project is based on our NSDI 2017 paper.
