sqlite - No such table while writing to sqlite3 database from PySpark via JDBC
I am trying to write a Spark DataFrame to a SQLite3 database in Python, using the sqlite-jdbc driver from Xerial, following this example. I am getting the error

java.sql.SQLException: [SQLITE_ERROR] SQL error or missing database (no such table: test)
The database file hello.db exists, and I have created the table test with the following schema:

sqlite> .schema test
CREATE TABLE test (age BIGINT, name TEXT);
I am running

spark-submit --jars ../extras/sqlite-jdbc-3.8.11.2.jar example.py

so that the driver can be found. I am running Spark 1.6.0.
A (hopefully) reproducible example:
import os
os.environ["SPARK_HOME"] = "/usr/lib/spark"

import findspark
findspark.init()

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

config = {
    "spark.cores.max": "5",
    "spark.master": "spark://master2:7077",
    "spark.python.profile": "false",
    "spark.ui.enabled": "false",
    "spark.executor.extraClassPath": "../extras/sqlite-jdbc-3.8.11.2.jar",
    "spark.driver.extraClassPath": "../extras/sqlite-jdbc-3.8.11.2.jar",
    "spark.jars": "../extras/sqlite-jdbc-3.8.11.2.jar"
}

conf = SparkConf()
for key, value in config.iteritems():
    conf = conf.set(key, value)

sc = SparkContext(appName="test", conf=conf)
sqlcontext = SQLContext(sc)

d = [{'name': 'alice', 'age': 31}]
df = sqlcontext.createDataFrame(d)

url = "jdbc:sqlite:hello.db"
df.write.jdbc(url=url, table="test", mode="overwrite",
              properties={"driver": "org.sqlite.JDBC"})
In general, each Spark executor performs reads and writes separately, so the data source and sink have to be accessible from every worker node. Here, each executor resolves jdbc:sqlite:hello.db relative to its own working directory, so it opens (or creates) its own empty local file rather than the one the table was created in, hence the "no such table" error. This makes SQLite rather useless in this scenario (it is great for local lookups, though).
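One workaround when the data is small: since hello.db lives on a single machine anyway, bring the rows to the driver and write them there with Python's built-in sqlite3 module instead of going through the executors. A minimal sketch, assuming the DataFrame from the example above fits in driver memory:

import sqlite3

# Bring all rows to the driver; only safe for small results.
rows = df.collect()

conn = sqlite3.connect("hello.db")
conn.execute("CREATE TABLE IF NOT EXISTS test (age BIGINT, name TEXT)")
conn.executemany("INSERT INTO test (age, name) VALUES (?, ?)",
                 [(row.age, row.name) for row in rows])
conn.commit()
conn.close()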
If you want to store the output in a database in non-local mode, you'll need a proper database server.
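For example, the same write against a PostgreSQL server looks almost identical; only the URL and driver class change. The host, database name, and credentials below are placeholders, and the PostgreSQL JDBC jar would have to be passed to spark-submit the same way as the SQLite jar above:

# Placeholders: replace db-host, mydb, and the credentials with real values.
url = "jdbc:postgresql://db-host:5432/mydb"
df.write.jdbc(url=url, table="test", mode="overwrite",
              properties={
                  "user": "spark",
                  "password": "secret",
                  "driver": "org.postgresql.Driver"
              })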