
Avro data is not converted in Spark


I have written one of the Spark data frame columns to Kafka in Avro format. Then I try to read the data back from this topic and convert it from Avro into a data frame column. The column type is a timestamp, but instead of the timestamps from the database I get default values:



1970-01-01 00:00:00
1970-01-01 00:00:00
1970-01-01 00:00:00
1970-01-01 00:00:00
1970-01-01 00:00:00
1970-01-01 00:00:00
1970-01-01 00:00:00
1970-01-01 00:00:00
1970-01-01 00:00:00
1970-01-01 00:00:00


The same behavior occurs with columns of other data types, such as String. The initial timestamp values look like this, and this is the result I want to obtain:



2019-03-19 12:26:03.003
2019-03-19 12:26:09
2019-03-19 12:27:04.003
2019-03-19 12:27:08.007
2019-03-19 12:28:01.013
2019-03-19 12:28:05.007
2019-03-19 12:28:09.023
2019-03-19 12:29:04.003
2019-03-19 12:29:07.047
2019-03-19 12:30:00.003


And here is the same data after conversion to Avro:



00 F0 E1 9B BC B3 9C C2 05
00 80 E9 F7 C1 B3 9C C2 05
00 F0 86 B2 F6 B3 9C C2 05
00 B0 E9 9A FA B3 9C C2 05
00 90 A4 E1 AC B4 9C C2 05
00 B0 EA C8 B0 B4 9C C2 05
00 B0 88 B3 B4 B4 9C C2 05
00 F0 BE EA E8 B4 9C C2 05
00 B0 89 DE EB B4 9C C2 05
00 F0 B6 9E 9E B5 9C C2 05


What can I do to fix this conversion problem?
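As a sanity check (my own decoding, not part of the pipeline above), the hex rows can be read by hand under Avro's standard binary encoding, assuming the column was serialized as a nullable long union: the leading 00 byte is the union branch index, and the rest is a zigzag-encoded varint long.

```python
# A minimal sketch (my own decoding, not part of the pipeline above): decode one
# of the hex rows shown earlier, assuming Avro's standard binary encoding for a
# nullable long -- a union branch index byte followed by a zigzag varint long.
from datetime import datetime, timedelta, timezone

def decode_avro_long(data: bytes) -> int:
    """Decode one Avro long: LEB128-style varint, then zigzag."""
    raw, shift = 0, 0
    for b in data:
        raw |= (b & 0x7F) << shift
        if not b & 0x80:
            return (raw >> 1) ^ -(raw & 1)  # zigzag -> signed long
        shift += 7
    raise ValueError("truncated varint")

row = bytes.fromhex("00 F0 E1 9B BC B3 9C C2 05".replace(" ", ""))
value = decode_avro_long(row[1:])  # skip the union-index byte (branch 0)
print(value)  # 1552998363003000

epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
print(epoch + timedelta(microseconds=value))  # 2019-03-19 12:26:03.003000+00:00
```

The decoded long is 1552998363003000, i.e. the first expected timestamp (2019-03-19 12:26:03.003) in microseconds since the epoch. So the bytes themselves look intact, and the mismatch appears to be between the microsecond-precision values Spark wrote and the timestamp-millis long that the reader schema declares.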



Here is the code that writes the Avro data to Kafka, reads it back, and converts it to a data frame. I used the to_avro and from_avro methods from spark-avro:



import org.apache.spark.sql.avro._

val castDF = testDataDF.select(to_avro(testDataDF.col("update_database_time")) as 'value)

castDF
.write
.format("kafka")
.option("kafka.bootstrap.servers", bootstrapServers)
.option("topic", "app_state_test")
.save()

val cachedDf = spark
.read
.format("kafka")
.option("kafka.bootstrap.servers", bootstrapServers)
.option("subscribe", "app_state_test")
.load()

val jsonSchema = ""name": "update_database_time", "type": "long", "logicalType": "timestamp-millis", "default": "NONE""
cachedDf.select(from_avro(cachedDf.col("value"), jsonSchema) as 'test)









      apache-spark apache-kafka spark-avro






      asked Mar 27 at 13:49









Cassie


























