+ # hive-third-functions [![Build Status](https://travis-ci.org/aaronshan/hive-third-functions.svg?branch=master)](https://travis-ci.org/aaronshan/hive-third-functions)
+
## Introduction
- My udf functions
+ Some useful Hive UDF functions.

## Build

@@ -8,20 +10,94 @@ cd ${project_home}
mvn clean package
```

- It will generate hive-third-functions-1.0-SNAPSHOT-shaded.jar in target directory.
+ If you want to skip the unit tests, run:
+ ```
+ cd ${project_home}
+ mvn clean package -DskipTests
+ ```
+
+ It will generate hive-third-functions-${version}-shaded.jar in the target directory.
+
+ You can also download the jar directly from the [release page](https://github.com/aaronshan/hive-third-functions/releases).
+
+ > The current latest version is `2.0.0`.
+
+ ## Functions
+
+ ### 1. String functions
+
+ | function | description |
+ |:--|:--|
+ | pinyin(string) -> string | convert Chinese to pinyin |
+ | md5(string) -> string | MD5 hash |
+ | sha256(string) -> string | SHA-256 hash |
+
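As a rough illustration of what `md5` and `sha256` return, the sketch below hex-encodes a digest using the JDK's `MessageDigest`. The class name `HashSketch` and the helper `hexDigest` are hypothetical, and both the UTF-8 encoding and the null-in/null-out behaviour are assumptions; the actual UDF classes may differ.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of hive-third-functions.
final class HashSketch {

    // Returns the lowercase hex digest of the input, or null for null input (assumed behaviour).
    static String hexDigest(String algorithm, String input) {
        if (input == null) {
            return null;
        }
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // %02x renders each byte as two hex characters.
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // "MD5" and "SHA-256" are required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hexDigest("MD5", "aaronshan"));
        System.out.println(hexDigest("SHA-256", "aaronshan"));
    }
}
```
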
+ ### 2. Array functions
+
+ | function | description |
+ |:--|:--|
+ | array_contains(array, value) -> boolean | whether the array contains the value |
+
+ ### 3. Date functions
+
+ | function | description |
+ |:--|:--|
+ | dayofweek(date_string \| date) -> int | day of the week: Monday returns 1, Sunday returns 7; returns null on error |
+ | zodiac_en(date_string \| date) -> string | convert date to zodiac sign (English) |
+ | zodiac_cn(date_string \| date) -> string | convert date to zodiac sign (Chinese) |
+ | typeofdate(date_string \| date) -> string | for the Chinese calendar: returns the type of the date (1: statutory holiday, 2: regular weekend, 3: regular workday, 4: make-up workday); returns -1 on error |
+
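The `dayofweek` convention described in the table (Monday is 1, Sunday is 7, null on error) matches the ISO numbering in `java.time`. A minimal sketch, assuming ISO `yyyy-MM-dd` input; the class `DayOfWeekSketch` is hypothetical and is not the library's `UDFDayOfWeek`:

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of hive-third-functions.
final class DayOfWeekSketch {

    // Returns 1 (Monday) through 7 (Sunday), or null when the input cannot be parsed.
    static Integer dayOfWeek(String dateString) {
        if (dateString == null) {
            return null;
        }
        try {
            // DayOfWeek.getValue() follows ISO-8601: MONDAY = 1 ... SUNDAY = 7.
            return LocalDate.parse(dateString).getDayOfWeek().getValue();
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(dayOfWeek("2016-07-12")); // 2 (a Tuesday)
        System.out.println(dayOfWeek("not-a-date")); // null
    }
}
```
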
+ ### 4. JSON functions
+ | function | description |
+ |:--|:--|
+ | json_array_get(json_array, index) -> varchar | returns the element at the specified index in `json_array`. The index is zero-based. |
+ | json_array_length(json) -> bigint | returns the array length of `json` (a string containing a JSON array). |
+ | json_array_extract(json, jsonPath) -> array(varchar) | extracts a JSON array by the given jsonPath. |
+ | json_array_extract_scalar(json, jsonPath) -> array(varchar) | like `json_array_extract`, but returns the result values as strings (as opposed to being encoded as JSON). |
+ | json_extract(json, jsonPath) -> varchar | extracts JSON by the given jsonPath. |
+ | json_extract_scalar(json, jsonPath) -> varchar | like `json_extract`, but returns the result value as a string (as opposed to being encoded as JSON). |
+ | json_size(json, jsonPath) -> bigint | like `json_extract`, but returns the size of the value. For objects or arrays, the size is the number of members; the size of a scalar value is zero. |
+
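The size rule stated for `json_size` can be sketched on already-parsed values. This is only an illustration under assumptions: it treats a JSON object as a `Map`, a JSON array as a `List`, and anything else as a scalar, and it skips parsing and jsonPath evaluation entirely. `JsonSizeSketch` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of hive-third-functions.
final class JsonSizeSketch {

    // Objects and arrays report their member count; scalars report zero.
    static long size(Object value) {
        if (value instanceof Map) {
            return ((Map<?, ?>) value).size();
        }
        if (value instanceof List) {
            return ((List<?>) value).size();
        }
        return 0L;
    }

    public static void main(String[] args) {
        Map<String, Object> object = new LinkedHashMap<>();
        object.put("a", 1);
        object.put("b", 2);
        System.out.println(size(object));                 // 2
        System.out.println(size(Arrays.asList(1, 2, 3))); // 3
        System.out.println(size(1));                      // 0
    }
}
```
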
+ ### 5. China ID card functions
+
+ | function | description |
+ |:--|:--|
+ | id_card_province(string) -> string | get the user's province |
+ | id_card_city(string) -> string | get the user's city |
+ | id_card_area(string) -> string | get the user's area |
+ | id_card_birthday(string) -> string | get the user's birthday |
+ | id_card_gender(string) -> string | get the user's gender |
+ | is_valid_id_card(string) -> boolean | whether the string is a valid China ID card number |
+ | id_card_info(string) -> json | get China ID card info, including province, city, area, etc. |

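For readers unfamiliar with the ID number layout, here is a minimal sketch of how validity, birthday, and gender can be derived, assuming the standard 18-character format (GB 11643-1999): characters 7 to 14 hold the birth date, the 17th digit is odd for male, and the last character is an ISO 7064 MOD 11-2 check character. `IdCardSketch`, its method names, and its return formats are hypothetical; the library may also validate region codes and dates, which this sketch does not.

```java
// Hypothetical helper, not part of hive-third-functions.
final class IdCardSketch {

    // Per-position weights and check characters of the MOD 11-2 scheme.
    private static final int[] WEIGHTS = {7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2};
    private static final String CHECK_CHARS = "10X98765432";

    // Checks length, digits, and the check character only (no region or date validation).
    static boolean isValid(String id) {
        if (id == null || id.length() != 18) {
            return false;
        }
        int sum = 0;
        for (int i = 0; i < 17; i++) {
            char c = id.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
            sum += (c - '0') * WEIGHTS[i];
        }
        return CHECK_CHARS.charAt(sum % 11) == Character.toUpperCase(id.charAt(17));
    }

    // Birth date as yyyyMMdd, or null for an invalid number (assumed output format).
    static String birthday(String id) {
        return isValid(id) ? id.substring(6, 14) : null;
    }

    // "male" when the 17th digit is odd, "female" when even, null when invalid.
    static String gender(String id) {
        if (!isValid(id)) {
            return null;
        }
        return (id.charAt(16) - '0') % 2 == 1 ? "male" : "female";
    }

    public static void main(String[] args) {
        String id = "110101198901084517";
        System.out.println(isValid(id));  // true
        System.out.println(birthday(id)); // 19890108
        System.out.println(gender(id));   // male
    }
}
```
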
## Use

Put these statements into ${HOME}/.hiverc, or execute them in the Hive CLI.

```
add jar ${jar_location_dir}/hive-third-functions-1.0-SNAPSHOT-shaded.jar
- create temporary function arrayContains as 'cc.shanruifeng.functions.array.UDFArrayContains';
+ create temporary function array_contains as 'cc.shanruifeng.functions.array.UDFArrayContains';
create temporary function dayOfWeek as 'cc.shanruifeng.functions.date.UDFDayOfWeek';
create temporary function typeOfDay as 'cc.shanruifeng.functions.date.UDFTypeOfDay';
- create temporary function zodiacCn as 'cc.shanruifeng.functions.date.UDFZodiacSignCn';
- create temporary function zodiacEn as 'cc.shanruifeng.functions.date.UDFZodiacSignEn';
+ create temporary function zodiac_cn as 'cc.shanruifeng.functions.date.UDFZodiacSignCn';
+ create temporary function zodiac_en as 'cc.shanruifeng.functions.date.UDFZodiacSignEn';
create temporary function pinyin as 'cc.shanruifeng.functions.string.UDFChineseToPinYin';
+ create temporary function md5 as 'cc.shanruifeng.functions.string.UDFMd5';
+ create temporary function sha256 as 'cc.shanruifeng.functions.string.UDFSha256';
+ create temporary function json_array_get as 'cc.shanruifeng.functions.json.UDFJsonArrayGet';
+ create temporary function json_array_length as 'cc.shanruifeng.functions.json.UDFJsonArrayLength';
+ create temporary function json_array_extract as 'cc.shanruifeng.functions.json.UDFJsonArrayExtract';
+ create temporary function json_array_extract_scalar as 'cc.shanruifeng.functions.json.UDFJsonArrayExtractScalar';
+ create temporary function json_extract as 'cc.shanruifeng.functions.json.UDFJsonExtract';
+ create temporary function json_extract_scalar as 'cc.shanruifeng.functions.json.UDFJsonExtractScalar';
+ create temporary function json_size as 'cc.shanruifeng.functions.json.UDFJsonSize';
+ create temporary function id_card_province as 'cc.shanruifeng.functions.card.UDFChinaIdCardProvince';
+ create temporary function id_card_city as 'cc.shanruifeng.functions.card.UDFChinaIdCardCity';
+ create temporary function id_card_area as 'cc.shanruifeng.functions.card.UDFChinaIdCardArea';
+ create temporary function id_card_birthday as 'cc.shanruifeng.functions.card.UDFChinaIdCardBirthday';
+ create temporary function id_card_gender as 'cc.shanruifeng.functions.card.UDFChinaIdCardGender';
+ create temporary function is_valid_id_card as 'cc.shanruifeng.functions.card.UDFChinaIdCardValid';
+ create temporary function id_card_info as 'cc.shanruifeng.functions.card.UDFChinaIdCardInfo';
```

You can run these statements in the Hive CLI to get the details of a function.
@@ -38,4 +114,46 @@ zodiacCn(date) - from the input date string or separate month and day arguments,
Example:
 > select zodiacCn(date_string) from src;
 > select zodiacCn(month, day) from src;
+ ```
+
+ ### Examples
+ ```
+ select pinyin('中国') => zhongguo
+ select md5('aaronshan') => 95686bc0483262afe170b550dd4544d1
+ select sha256('aaronshan') => d16bb375433ad383169f911afdf45e209eabfcf047ba1faebdd8f6a0b39e0a32
+ ```
+
+ ```
+ select dayOfWeek('2016-07-12') => 2
+ select typeOfDay('2016-10-01') => 1
+ select typeOfDay('2016-07-16') => 2
+ select typeOfDay('2016-07-15') => 3
+ select typeOfDay('2016-09-18') => 4
+ select zodiac_cn('1989-01-08') => 魔羯座
+ select zodiac_en('1989-01-08') => Capricorn
+ ```
+
+ ```
+ select id_card_info('110101198901084517') => {"area":"东城区","valid":true,"province":"北京市","gender":"男","city":"北京市"}
+ ```
+
140+ ```
141+ select json_array_get("[{\"a\":{\"b\":\"13\"}}, {\"a\":{\"b\":\"18\"}}, {\"a\":{\"b\":\"12\"}}]", 1); => {"a":{"b":"18"}}
142+ select json_array_get('["a", "b", "c"]', 0); => 'a'
143+ select json_array_get('["a", "b", "c"]', 1); => 'b'
144+ select json_array_get('["c", "b", "a"]', -1); => 'a'
145+ select json_array_get('["c", "b", "a"]', -2); => 'b'
146+ select json_array_get('[]', 0); => null
147+ select json_array_get('["a", "b", "c"]', 10); => null
148+ select json_array_get('["c", "b", "a"]', -10); => null
149+ select json_array_length("[{\"a\":{\"b\":\"13\"}}, {\"a\":{\"b\":\"18\"}}, {\"a\":{\"b\":\"12\"}}]"); => 3
150+ select json_array_extract("[{\"a\":{\"b\":\"13\"}}, {\"a\":{\"b\":\"18\"}}, {\"a\":{\"b\":\"12\"}}]", "$.a.b"); => ["\"13\"","\"18\"","\"12\""]
151+ select json_array_extract_scalar("[{\"a\":{\"b\":\"13\"}}, {\"a\":{\"b\":\"18\"}}, {\"a\":{\"b\":\"12\"}}]", "$.a.b") => ["13","18","12"]
152+ select json_extract("{\"a\":{\"b\":\"12\"}}", "$.a.b"); => "12"
153+ select json_extract_scalar("{\"a\":{\"b\":\"12\"}}", "$.a.b") => 12
154+ select json_extract_scalar('[1, 2, 3]', '$[2]');
155+ select json_extract_scalar(json, '$.store.book[0].author');
156+ select json_size('{"x": {"a": 1, "b": 2}}', '$.x'); => 2
157+ select json_size('{"x": [1, 2, 3]}', '$.x'); => 3
158+ select json_size('{"x": {"a": 1, "b": 2}}', '$.x.a'); => 0
41159```
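
The `json_array_get` examples above imply that a negative index counts from the end of the array and that any out-of-range index yields null. A minimal sketch of that index resolution on an already-parsed list follows; `ArrayIndexSketch` is a hypothetical name, and JSON parsing is deliberately left out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of hive-third-functions.
final class ArrayIndexSketch {

    // Zero-based lookup; negative indexes count back from the last element.
    static <T> T get(List<T> elements, long index) {
        int n = elements.size();
        long resolved = index < 0 ? n + index : index;
        if (resolved < 0 || resolved >= n) {
            return null; // out of range in either direction
        }
        return elements.get((int) resolved);
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("c", "b", "a");
        System.out.println(get(values, -1));                            // a
        System.out.println(get(values, -2));                            // b
        System.out.println(get(values, 10));                            // null
        System.out.println(get(Collections.<String>emptyList(), 0));    // null
    }
}
```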