python - Can't define a UDF inside a PySpark project -
I have a Python project that uses PySpark. I'm trying to define a UDF inside the Spark project itself (not in my Python project), in spark\python\pyspark\ml\tuning.py, but I run into pickling problems: the UDF can't be loaded. The code:
    from pyspark.sql.functions import udf
    from pyspark.sql.types import FloatType

    test_udf = udf(lambda x: -x[1], returnType=FloatType())
    d = data.withColumn("new_col", test_udf(data["x"]))
    d.show()
When I call d.show() I get an exception saying the attribute test_udf is unknown.
In my Python project I have defined many UDFs and they all worked fine.
Add the following import; the error usually means Spark isn't recognizing the data type:

    from pyspark.sql.types import *

Let me know if this helps. Thanks.