Extracting data from .gz files


I am trying to retrieve specific columns of data from the past 7 days' .gz files and split them into two separate CSV files. I then want to get a line count of each file before removing the duplicates.

Currently my output files are empty. Any help would be much appreciated.
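The counting and de-duplication step described above is not part of the posted script. One way it could look is sketched here; `count_and_dedupe` is a hypothetical helper name, and the sketch assumes that each output file fits in memory and that "duplicate" means an identical line.

```python
def count_and_dedupe(path):
    """Rewrite `path` without duplicate lines.

    Returns (lines_before, lines_after). The first occurrence of each
    line is kept, in its original position.
    """
    with open(path, 'r') as f:
        lines = f.read().splitlines()

    seen = set()
    unique = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            unique.append(line)

    # Overwrite the file with the de-duplicated content.
    with open(path, 'w') as f:
        for line in unique:
            f.write(line + '\n')

    return len(lines), len(unique)
```

Calling it once per output file gives the count before and after in a single pass, so the "before" figure can be logged without a separate read.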



    #!/usr/bin/python

    import gzip, os, csv, time, glob, shutil, smtplib, logging, os.path, datetime, argparse
    from datetime import date, timedelta


    strDate = datetime.datetime.strftime(datetime.datetime.now(), '%Y-%m-%d')
    home = os.path.expanduser("~")
    LOG_PATH = home + '/logs/'
    LOG_FILENAME = LOG_PATH + strDate + '_scripts.log'

    SOURCEDIR = home + '/archive/data/incoming/'
    ARCHIVE = home + '/archive/data/Accounts/'
    count = 7
    limit = 0

    if not os.path.isdir(ARCHIVE):
        try:
            os.mkdir(ARCHIVE)
            logging.info('Directory ' + ARCHIVE + ' was missing but has been created.')
        except:
            logging.warning('Directory ' + ARCHIVE + " is missing and can't be created. Exiting")
            exit(1)


    while ( count > limit ):
        yesterday = date.today() - timedelta(count)
        yesterday = yesterday.strftime('%Y%m%d')

        if os.path.exists(SOURCEDIR+"spam_"+yesterday+".csv.gz"):
            fin = gzip.open(SOURCEDIR+"spam_"+yesterday+".csv.gz",'rb')
            reader = csv.reader(fin,delimiter = ',',quotechar="'")
            fo = open(ARCHIVE+"Email_Accounts"+yesterday+".csv", 'ab')
            fo2 = open(ARCHIVE+"Accounts"+yesterday+".csv", 'ab')
            csvWriter = csv.writer(fo)
            csvWriter2 = csv.writer(fo2)

            try:
                for row in reader:
                    SITE = row[2].strip()
                    SITE = SITE.rjust(2, '0')
                    ACCOUNT = row[1].strip()
                    ACCOUNT = ACCOUNT.rjust(9, '0')
                    EMAIL = row[3].strip()
                    DATA = (SITE+ACCOUNT+EMAIL)
                    EMAILData = (EMAIL)
                    ACCOUNTDATA = (SITE+ACCOUNT)
                    csvWriter.writerow(EMAILData)
                    csvWriter2.writerow(ACCOUNTDATA)

            except IndexError:
                pass
            fo.close()
            fo2.close()
            fin.close()
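Some possible reasons for the empty output, offered as guesses that have not been checked against the real data: as posted, `count` is never decremented, so the `while` loop never moves past the first date. The `try`/`except IndexError` wraps the whole `for` loop, so the first row with fewer than four columns silently ends processing for that file. `writerow()` is also given a bare string, which the csv module writes one character per column. The sketch below addresses those three points. It is written for Python 3 (text mode with `newline=''`), whereas the question is tagged python-2.7, where the binary modes used in the question would stay. `extract_day` and `extract_last_days` are hypothetical names, and the column positions are taken from the question.

```python
import csv
import gzip
import os
from datetime import date, timedelta


def extract_day(src_path, email_path, account_path):
    """Split one gzipped CSV into an e-mail file and an account file.

    Returns the number of rows written. Column positions follow the
    question: 1 = account, 2 = site, 3 = e-mail.
    """
    written = 0
    # Text mode with newline='' is what the Python 3 csv module expects.
    with gzip.open(src_path, 'rt', newline='') as fin, \
            open(email_path, 'a', newline='') as fo, \
            open(account_path, 'a', newline='') as fo2:
        reader = csv.reader(fin, delimiter=',', quotechar="'")
        email_writer = csv.writer(fo)
        account_writer = csv.writer(fo2)
        for row in reader:
            if len(row) < 4:
                # Skip short rows instead of letting an IndexError
                # end the whole loop.
                continue
            site = row[2].strip().rjust(2, '0')
            account = row[1].strip().rjust(9, '0')
            email = row[3].strip()
            # writerow() takes a sequence of fields; a bare string would
            # be written one character per column.
            email_writer.writerow([email])
            account_writer.writerow([site + account])
            written += 1
    return written


def extract_last_days(source_dir, archive_dir, days=7, today=None):
    """Process spam_YYYYMMDD.csv.gz for each of the previous `days` days.

    Returns a dict mapping each date stamp found to its row count.
    """
    today = today or date.today()
    totals = {}
    # range() moves the offset on every pass: days, days - 1, ..., 1.
    for offset in range(days, 0, -1):
        stamp = (today - timedelta(offset)).strftime('%Y%m%d')
        src = os.path.join(source_dir, 'spam_' + stamp + '.csv.gz')
        if not os.path.exists(src):
            continue
        totals[stamp] = extract_day(
            src,
            os.path.join(archive_dir, 'Email_Accounts' + stamp + '.csv'),
            os.path.join(archive_dir, 'Accounts' + stamp + '.csv'),
        )
    return totals
```

With the directories from the question this would be called as `extract_last_days(SOURCEDIR, ARCHIVE)`. The `with` block closes all three files even if a row raises an unexpected error.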









python-2.7

asked 2 days ago by P.4001 (edited 2 days ago)




























