How to use Python UDF for Hive as inline code without script file
I'm using the Python package impyla to connect to Hive programmatically; I'm not using the hive CLI. I'm trying to use a UDF written in Python.
All the tutorials I've seen do it like this:
ADD FILE myscript.py;
...
SELECT TRANSFORM (cols...)
USING 'python myscript.py'
AS ...
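For context, here is a minimal sketch of what a script such as myscript.py might contain. It relies on the way TRANSFORM streams data: each input row reaches the script as one tab-separated line on standard input, and the script writes result rows to standard output. The function name transform and the upper-casing logic are illustrative assumptions, not part of the original question.

```python
import io
import sys


def transform(stdin=sys.stdin, stdout=sys.stdout):
    """Upper-case the first column of every tab-separated input row.

    The upper-casing is placeholder logic; any per-row computation
    could go here.
    """
    for line in stdin:
        fields = line.rstrip("\n").split("\t")
        fields[0] = fields[0].upper()
        stdout.write("\t".join(fields) + "\n")


# In a real script the last line would simply be: transform()
# Here the function is exercised with in-memory streams instead.
demo_out = io.StringIO()
transform(io.StringIO("abc\t1\nxyz\t2\n"), demo_out)
print(demo_out.getvalue(), end="")
```

Passing the streams as parameters keeps the row logic testable without a Hive cluster.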
I thought the USING part could be any executable command that does the right thing. So I was thinking I could send the script on the fly as a string, like so:
USING 'python -c "import sys; ..."'
This would nicely avoid dealing with file transmission to Hadoop. However, I have trouble getting this to work.
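The inline form nests two layers of quoting: a single-quoted HiveQL string that contains a double-quoted argument for python -c. The sketch below assembles such a clause from a one-line snippet. The helper build_using_clause is hypothetical; it only builds the string and rejects characters that would break either quoting layer, and it makes no claim about how Hive or impyla will treat the result.

```python
def build_using_clause(code, interpreter="python"):
    """Wrap a one-line Python snippet in a HiveQL USING clause.

    Hypothetical helper: it refuses quotes, backslashes and newlines
    rather than trying to escape them.
    """
    for forbidden in ("'", '"', "\\", "\n"):
        if forbidden in code:
            raise ValueError("snippet must not contain %r" % forbidden)
    return "USING '%s -c \"%s\"'" % (interpreter, code)


# An identity transform: copy standard input to standard output.
clause = build_using_clause("import sys; sys.stdout.write(sys.stdin.read())")
print(clause)
```

Rejecting awkward characters is the conservative choice here; a snippet that needs string literals or loops is probably better kept in a file.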
After the real code didn't work, I reduced it to this dummy code, just to debug:
USING 'python -c "print 3"'
The error I'm getting is:
E impala.error.HiveServer2Error: Error while compiling statement: FAILED: IllegalArgumentException Can not create a Path from an empty string
More details:
test_hive_udf_example.py:77:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../src/sunny/sql/hive.py:80: in read
return super().read(sql, configuration=config)
../../src/sunny/sql/sql.py:136: in read
self._execute(sql, **kwargs)
../../src/sunny/sql/sql.py:130: in _execute
self._cursor.execute(sql, **kwargs)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:302: in execute
configuration=configuration)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:343: in execute_async
self._execute_async(op)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:362: in _execute_async
operation_fn()
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:340: in op
async=True)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:1027: in execute
return self._operation('ExecuteStatement', req)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:957: in _operation
resp = self._rpc(kind, request)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:925: in _rpc
err_if_rpc_not_ok(response)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
resp = TExecuteStatementResp(status=TStatus(statusCode=3, infoMessages=None, sqlState='42000', errorCode=40000, errorMessage=...mpiling statement: FAILED: IllegalArgumentException Can not create a Path from an empty string'), operationHandle=None)
def err_if_rpc_not_ok(resp):
if (resp.status.statusCode != TStatusCode.SUCCESS_STATUS and
resp.status.statusCode != TStatusCode.SUCCESS_WITH_INFO_STATUS and
resp.status.statusCode != TStatusCode.STILL_EXECUTING_STATUS):
> raise HiveServer2Error(resp.status.errorMessage)
E impala.error.HiveServer2Error: Error while compiling statement: FAILED: IllegalArgumentException Can not create a Path from an empty string
It seems to be complaining that it cannot find a script file, even though my code does not mention one at all.
By executing the same statement with the hive CLI instead of impyla, I verified that the syntax does not require a script file: USING 'python -c "..."' can work.
So the problem seems to be in how I use it via impyla.
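One way to rule out the snippet itself, independently of Hive and impyla, is to run it locally through python -c and feed it a few tab-separated rows. This is only a sketch: run_inline and the sample snippet are hypothetical, and it uses the local interpreter, which may differ from the one on the cluster nodes.

```python
import subprocess
import sys


def run_inline(code, rows):
    """Run `code` with `python -c`, sending tab-separated rows on stdin."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        input="".join(row + "\n" for row in rows),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # exchange str instead of bytes
        check=True,               # raise if the snippet exits with an error
    )
    return result.stdout.splitlines()


snippet = "import sys; sys.stdout.write(sys.stdin.read().upper())"
print(run_inline(snippet, ["a\t1", "b\t2"]))
```

If the snippet behaves locally but the statement still fails at compile time, that points at the connection layer rather than the Python code.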
Any pointer is welcome! Thanks!
python hive user-defined-functions impyla
edited Mar 25 at 5:25
zpz
asked Mar 25 at 2:22
zpz