Preserve order while converting a string array into an int array in Hive

Stella

I'm trying to convert a string array to an int array while keeping the original order. Here is a sample of what my data looks like:

id       attribut                       string_array
id1      attribut1, 10283:990000       ["10283","990000"]
id2      attribut2, 10283:36741000     ["10283","36741000"]
id3      attribut3, 10283:37871000     ["10283","37871000"]
id4      attribut4, 3215:90451000      ["3215","90451000"]

And here's how I convert the field "string_array" into an array of integers:

select
  id,
  attribut,
  string_array,
  collect_list(cast(array_explode as int)) as int_array
from table
lateral view outer explode(string_array) r as array_explode
group by id, attribut, string_array

It gives me:

id       attribut                        string_array              int_array
id1      attribut1,10283:990000         ["10283","990000"]        [990000,10283]
id2      attribut2,10283:36741000       ["10283","36741000"]      [10283,36741000]
id3      attribut3,10283:37871000       ["10283","37871000"]      [37871000,10283]
id4      attribut4,3215:90451000        ["3215","90451000"]       [90451000,3215]

As you can see, the order in "string_array" has not been preserved in "int_array", and I need it to be exactly the same as in "string_array". Does anyone know how to achieve this?

Any help would be much appreciated.

leftjoin

For Hive: use posexplode to get each element together with its position, then, in a subquery before collect_list, distribute by the group key and sort by position:

select
  id,
  attribut,
  string_array,
  collect_list(cast(element as int)) as int_array
from
(select *
   from table t
        lateral view outer posexplode(string_array) e as pos, element
   distribute by t.id, attribut, string_array -- distribute by group key
   sort by pos        -- sort by initial position
) t
group by id, attribut, string_array
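To try the posexplode approach without touching the real table, here is a minimal self-contained sketch. The CTE name sample, its two inline rows (copied from the question's data), and the alias int_array are only illustrative, and it assumes a Hive version that supports CTEs and select without from. It also sorts by id before pos, a small variation so that rows of the same id stay together on the reducer. Note that collect_list does not formally guarantee ordering, so treat this as a commonly used technique rather than a documented contract.

```sql
-- Hypothetical inline sample data standing in for the real table.
with sample as (
  select 'id1' as id, array('10283', '990000') as string_array
  union all
  select 'id4' as id, array('3215', '90451000') as string_array
)
select
  id,
  string_array,
  collect_list(cast(element as int)) as int_array
from (
  select s.id, s.string_array, e.pos, e.element
  from sample s
       lateral view outer posexplode(s.string_array) e as pos, element
  distribute by s.id      -- send all rows of one id to the same reducer
  sort by s.id, e.pos     -- keep each id's elements in their original position order
) t
group by id, string_array;
```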

Another way is to extract the substring from your attribut column and split it, without exploding (as you asked in the comment):

select split(regexp_extract(attribut, '[^,]+,(.*)$',1),':')

The regexp '[^,]+,(.*)$' means:

[^,]+ - one or more characters that are not a comma
, - a literal comma
(.*)$ - everything after the comma up to the end of the string, in capturing group 1

Demo:

select split(regexp_extract('attribut3,10283:37871000', '[^,]+,(.*)$',1),':')

Result:

["10283","37871000"]
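The split above still returns strings. If every attribut value carries exactly two ':'-separated numbers (an assumption based on the sample data, not something the question guarantees), the elements can be cast one by one and reassembled with array(), which keeps the order by construction. The aliases parts and int_array are illustrative names.

```sql
-- Sketch: assumes exactly two ':'-separated values after the comma.
select
  array(cast(parts[0] as int), cast(parts[1] as int)) as int_array
from (
  select split(regexp_extract('attribut3,10283:37871000', '[^,]+,(.*)$', 1), ':') as parts
) p;
-- returns [10283,37871000]
```

For arrays of varying length this does not apply, and the posexplode approach is the more general option.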
