Blog coding and discussion of coding about JavaScript, PHP, CGI, general web building etc.

Thursday, March 10, 2016

Generate tree structure from csv



I have scratched my head over this problem for a while now. I am basically trying to generate a tree hierarchy from a set of CSV data. The CSV data is not necessarily ordered. It looks something like this:

    Header: Record1,Record2,Value1,Value2
    Row: A,XX,22,33
    Row: A,XX,777,888
    Row: A,YY,33,11
    Row: B,XX,12,0
    Row: A,YY,13,23
    Row: B,YY,44,98

I am trying to make the way the grouping is performed as flexible as possible. The simplest form of grouping would be to group by Record1 and then Record2, with Value1 and Value2 stored under Record2, so that we get the following output:

    Record1
        Record2
            Value1 Value2

Which would be:

    A
        XX
            22,33
            777,888
        YY
            33,11
            13,23
    B
        XX
            12,0
        YY
            44,98

I am storing my group settings in a List at present, and I don't know whether this is hindering my thinking. This list contains a hierarchy of the groups, for example:

    Record1 (SchemaGroup)
        .column = Record1
        .columns = null
        .childGroups =
            Record2 (SchemaGroup)
                .column = Record2
                .columns = Value1 (CSVColumnInformation), Value2 (CSVColumnInformation)
                .childGroups = null

The code for this looks as follows:

    private class SchemaGroup {
        private SchemaGroupType type = SchemaGroupType.StaticText;  // default to text
        private String text;
        private CSVColumnInformation column = null;
        private List<SchemaGroup> childGroups = new ArrayList<SchemaGroup>();
        private List<CSVColumnInformation> columns = new ArrayList<CSVColumnInformation>();
    }

    private enum SchemaGroupType {
        /** Allow fixed text groups to be added */
        StaticText,
        /** Related to a column with common value */
        ColumnGroup
    }

I am struggling to produce an algorithm for this and to decide on the underlying structure to use. At present I am parsing the CSV top to bottom, using my own wrapper class:

    CSVParser csv = new CSVParser(content);
    String[] line;
    while ((line = csv.readLine()) != null) {
        ...
    }

I am just trying to kick start my coding brain.

Any thoughts?

Answer by lurker for Generate tree structure from csv


Based upon how this problem is posed, I would do the following:

  1. Define what your final data structure will look like to contain the tree.
  2. Define a representation for each row in your original text (perhaps a linked list for flexibility)
  3. Write a method that takes the represented row and inserts it into the tree data structure. As you step through your "row" linked list structure, create each branch that does not exist yet and traverse each branch that already exists.
  4. Start with an empty tree.
  5. Read each line of the file into your row item structure and call the method defined in step 3.
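
The steps above can be sketched in Java roughly as follows. This is only one possible reading of them: `RowTree`, `Node`, `insert`, `build` and `render` are hypothetical names chosen for illustration, a plain `String[]` stands in for the row representation, and the sketch assumes simple comma-separated rows with the record columns first (no quoting or escaping).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from the original answer.
class RowTree {
    /** Step 1: a tree node with child branches keyed by record value, plus leaf values. */
    static class Node {
        final Map<String, Node> children = new LinkedHashMap<>();
        final List<String> values = new ArrayList<>();
    }

    /** Step 3: walk the record columns, creating any branch that does not exist yet. */
    static void insert(Node root, String[] row, int recordCount) {
        Node current = root;
        for (int i = 0; i < recordCount; i++) {
            current = current.children.computeIfAbsent(row[i], k -> new Node());
        }
        // The remaining columns are stored together as one leaf entry, e.g. "22,33".
        current.values.add(String.join(",", Arrays.copyOfRange(row, recordCount, row.length)));
    }

    /** Steps 4 and 5: start with an empty tree and insert every row. */
    static Node build(List<String> lines, int recordCount) {
        Node root = new Node();
        for (String line : lines) {
            insert(root, line.split(","), recordCount);
        }
        return root;
    }

    /** Renders the tree as indented lines, four spaces per level. */
    static List<String> render(Node node, String indent, List<String> out) {
        for (String value : node.values) {
            out.add(indent + value);
        }
        for (Map.Entry<String, Node> entry : node.children.entrySet()) {
            out.add(indent + entry.getKey());
            render(entry.getValue(), indent + "    ", out);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "A,XX,22,33", "A,XX,777,888", "A,YY,33,11",
            "B,XX,12,0", "A,YY,13,23", "B,YY,44,98");
        for (String s : render(build(lines, 2), "", new ArrayList<>())) {
            System.out.println(s);
        }
    }
}
```

Because the children are held in a `LinkedHashMap`, branches appear in the order in which they were first seen, so the input does not need to be sorted beforehand.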

Does that help?

Answer by Jong Bor Lee for Generate tree structure from csv


The basic idea isn't difficult: group by the first record, then by the second record, etc. until you get something like this:

    (A,XX,22,33)
    (A,XX,777,888)
    -------------------------
    (A,YY,33,11)
    (A,YY,13,23)
    =============
    (B,XX,12,0)
    -------------------------
    (B,YY,44,98)

and then work backwards to build the trees.

However, there is a recursive component that makes it somewhat hard to reason about this problem, or show it step by step, so it's actually easier to write pseudocode.

I'll assume that every row in your CSV is represented as a tuple. Each tuple has "records" and "values", using the same terms you use in your question. "Records" are the things that must be put into a hierarchical structure. "Values" will be the leaves of the tree. I'll put these terms in quotation marks when I use them with these specific meanings.

I also assume that all "records" come before all "values".

Without further ado, the code:

    // builds tree and returns a list of root nodes
    // list_of_tuples: a list of tuples read from your csv
    // curr_position: used to keep track of recursive calls
    // number_of_records: assuming each csv row has n records and then m values, number_of_records equals n
    function build_tree(list_of_tuples, curr_position, number_of_records) {
        // check if we have already reached the "values" (which shouldn't get converted into trees)
        if (curr_position == number_of_records) {
            return list of nodes, each containing a "value" (i.e. everything from position number_of_records on)
        }

        grouped = group tuples in list_of_tuples that have the same value in position curr_position, and store these groups indexed by such common value
        unique_values = get unique values in curr_position

        list_of_nodes = empty list

        // create the nodes and (recursively) their children
        for each val in unique_values {
            the_node = create tree node containing val
            the_children = build_tree(grouped[val], curr_position+1, number_of_records)
            the_node.set_children(the_children)

            list_of_nodes.append(the_node)
        }

        return list_of_nodes
    }

    // in your example, this returns a node with "A" and a node with "B"
    // third parameter is 2 because you have 2 "records"
    build_tree(list_parsed_from_csv, 0, 2)

Now you'd have to think about the specific data structures to use, but hopefully this shouldn't be too difficult if you understand the algorithm (as you mention, I think deciding on a data structure early on may have been hindering your thoughts).
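
One possible choice of data structures for the pseudocode, sketched in Java. The names `GroupingTree`, `Node`, `buildTree` and `describe` are hypothetical; the sketch assumes each tuple is a `String[]`, keeps groups in first-seen order with a `LinkedHashMap`, and stores the "values" of a row as a single leaf label.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of build_tree; names and types are assumptions.
class GroupingTree {
    static class Node {
        final String label;
        List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
    }

    static List<Node> buildTree(List<String[]> tuples, int currPosition, int numberOfRecords) {
        List<Node> nodes = new ArrayList<>();

        // Base case: we reached the "values"; each tuple becomes one leaf node.
        if (currPosition == numberOfRecords) {
            for (String[] t : tuples) {
                nodes.add(new Node(String.join(",",
                        Arrays.copyOfRange(t, numberOfRecords, t.length))));
            }
            return nodes;
        }

        // Group the tuples by the value found at currPosition.
        Map<String, List<String[]>> grouped = new LinkedHashMap<>();
        for (String[] t : tuples) {
            grouped.computeIfAbsent(t[currPosition], k -> new ArrayList<>()).add(t);
        }

        // One node per unique value, with children built recursively from its group.
        for (Map.Entry<String, List<String[]>> e : grouped.entrySet()) {
            Node node = new Node(e.getKey());
            node.children = buildTree(e.getValue(), currPosition + 1, numberOfRecords);
            nodes.add(node);
        }
        return nodes;
    }

    /** Compact text form of a subtree, e.g. "B(XX(12,0) YY(44,98))". */
    static String describe(Node n) {
        if (n.children.isEmpty()) {
            return n.label;
        }
        List<String> parts = new ArrayList<>();
        for (Node c : n.children) {
            parts.add(describe(c));
        }
        return n.label + "(" + String.join(" ", parts) + ")";
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        for (String s : new String[] {"A,XX,22,33", "A,XX,777,888", "A,YY,33,11",
                                      "B,XX,12,0", "A,YY,13,23", "B,YY,44,98"}) {
            rows.add(s.split(","));
        }
        for (Node root : buildTree(rows, 0, 2)) {
            System.out.println(describe(root));
        }
    }
}
```

Here the grouping map's key set doubles as the list of unique values, so a separate `unique_values` pass is not needed.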

Answer by svick for Generate tree structure from csv


If you know you'll have just two levels of Records, I would use something like

    Map<Record1, Map<Record2, List<Values>>>

When you read a new line, look in the outer map to check whether that value for Record1 already exists; if not, create a new empty inner Map for it.

Then check whether the inner map already has an entry for that Record2 value. If not, create a new List.

Then read the values and add them to the list.
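
A minimal sketch of this two-level lookup in Java, assuming string keys and storing each row's values as a small list. `NestedMapIndex` and `index` are hypothetical names, and `TreeMap` is an assumption made here so that the keys come out sorted; any `Map` implementation would do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; assumes exactly two record columns followed by two value columns.
class NestedMapIndex {
    static Map<String, Map<String, List<List<String>>>> index(List<String> lines) {
        Map<String, Map<String, List<List<String>>>> outer = new TreeMap<>();
        for (String line : lines) {
            String[] f = line.split(",");
            outer.computeIfAbsent(f[0], k -> new TreeMap<>())   // inner map for Record1, created on demand
                 .computeIfAbsent(f[1], k -> new ArrayList<>()) // list for Record2, created on demand
                 .add(Arrays.asList(f[2], f[3]));               // the values of this row
        }
        return outer;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "A,XX,22,33", "A,XX,777,888", "A,YY,33,11",
            "B,XX,12,0", "A,YY,13,23", "B,YY,44,98");
        System.out.println(index(lines));
    }
}
```

`computeIfAbsent` (Java 8 and later) folds the "check whether it exists, otherwise create it" step into a single call.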

Answer by Pangea for Generate tree structure from csv


Here is a basic working solution in the form of a JUnit test (no assertions, though), simplified by using the Google Guava collections. The code is self-explanatory; instead of raw file I/O you should use a CSV library for reading the CSV. This should give you the basic idea.

    import java.io.File;
    import java.io.IOException;
    import java.util.Collection;
    import java.util.Collections;
    import java.util.List;
    import java.util.Objects;
    import java.util.Set;

    import org.junit.Test;

    import com.google.common.base.Charsets;
    import com.google.common.base.Splitter;
    import com.google.common.collect.ArrayListMultimap;
    import com.google.common.collect.Iterables;
    import com.google.common.collect.Multimap;
    import com.google.common.collect.Sets;
    import com.google.common.io.Files;

    public class MyTest
    {
        @Test
        public void test1()
        {
            List<String> rows = getAllDataRows();

            Multimap<Records, Values> table = indexData(rows);

            printTree(table);
        }

        private void printTree(Multimap<Records, Values> table)
        {
            Set<String> alreadyPrintedRecord1s = Sets.newHashSet();

            // Note: an ArrayListMultimap iterates its keys in hash order, so keys that
            // share the same r1 are not guaranteed to come out next to each other.
            for (Records r : table.keySet())
            {
                if (!alreadyPrintedRecord1s.contains(r.r1))
                {
                    System.err.println(r.r1);
                    alreadyPrintedRecord1s.add(r.r1);
                }

                System.err.println("\t" + r.r2);

                Collection<Values> allValues = table.get(r);

                for (Values v : allValues)
                {
                    System.err.println("\t\t" + v.v1 + " , " + v.v2);
                }
            }
        }

        // Index every row by its (Record1, Record2) pair.
        private Multimap<Records, Values> indexData(List<String> lines)
        {
            Multimap<Records, Values> table = ArrayListMultimap.create();

            for (String row : lines)
            {
                Iterable<String> split = Splitter.on(",").split(row);
                String[] data = Iterables.toArray(split, String.class);

                table.put(new Records(data[0], data[1]), new Values(data[2], data[3]));
            }
            return table;
        }

        private List<String> getAllDataRows()
        {
            List<String> lines = Collections.emptyList();

            try
            {
                lines = Files.readLines(new File("C:/test.csv"), Charsets.US_ASCII);
            }
            catch (IOException e)
            {
                e.printStackTrace();
            }

            // remove header; skipped if reading failed, because the fallback list is immutable
            if (!lines.isEmpty())
            {
                lines.remove(0);
            }

            return lines;
        }
    }

    // Records.java (separate file): the (Record1, Record2) key
    public class Records
    {
        public final String r1, r2;

        public Records(final String r1, final String r2)
        {
            this.r1 = r1;
            this.r2 = r2;
        }

        @Override
        public int hashCode()
        {
            return Objects.hash(r1, r2);
        }

        @Override
        public boolean equals(final Object obj)
        {
            if (this == obj)
            {
                return true;
            }
            if (!(obj instanceof Records))
            {
                return false;
            }
            Records other = (Records) obj;
            return Objects.equals(r1, other.r1) && Objects.equals(r2, other.r2);
        }

        @Override
        public String toString()
        {
            return "Records1and2 [r1=" + r1 + ", r2=" + r2 + "]";
        }
    }

    // Values.java (separate file): the (Value1, Value2) pair of one row
    public class Values
    {
        public final String v1, v2;

        public Values(final String v1, final String v2)
        {
            this.v1 = v1;
            this.v2 = v2;
        }

        @Override
        public int hashCode()
        {
            return Objects.hash(v1, v2);
        }

        @Override
        public boolean equals(final Object obj)
        {
            if (this == obj)
            {
                return true;
            }
            if (!(obj instanceof Values))
            {
                return false;
            }
            Values other = (Values) obj;
            return Objects.equals(v1, other.v1) && Objects.equals(v2, other.v2);
        }

        @Override
        public String toString()
        {
            return "Values1and2 [v1=" + v1 + ", v2=" + v2 + "]";
        }
    }

Answer by Witt for Generate tree structure from csv


I recently needed to do pretty much the same thing and wrote tree-builder.com to accomplish the task. The only difference is that, with your CSV laid out as it is, the last two parameters will be parent and child instead of peers. Also, my version doesn't accept a header row.

The code is all in JavaScript; it uses jstree to build the tree. You can use Firebug or just view the source on the page to see how it's done. It would probably be pretty easy to tweak it to escape the comma in your CSV in order to keep the last two parameters as a single child.

Answer by Sansice for Generate tree structure from csv


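    // requires java.util.ArrayList and java.util.Collections
    public static void main (String arg[]) throws Exception
    {
        ArrayList<String> arRows = new ArrayList<String>();
        arRows.add("A,XX,22,33");
        arRows.add("A,XX,777,888");
        arRows.add("A,YY,33,11");
        arRows.add("B,XX,12,0");
        arRows.add("A,YY,13,23");
        arRows.add("B,YY,44,98");
        for(String sTreeRow:createTree(arRows,",")) //or use //// or whatever applicable
            System.out.println(sTreeRow);
    }

    public static ArrayList<String> createTree (ArrayList<String> arRows, String sSeperator) throws Exception
    {
        ArrayList<String> arReturnNodes = new ArrayList<String>();
        Collections.sort(arRows);
        String sLastPath = "";
        int iFolderLength = 0;
        // The loop header and the local declarations up to the inner loop were lost
        // in the source; they are reconstructed from how the variables are used below.
        for(int iRow=0;iRow<arRows.size();iRow++)
        {
            String sRow = arRows.get(iRow);
            String[] sFolders = sRow.split(sSeperator);
            iFolderLength = sFolders.length;
            String[] sLastFolders = sLastPath.split(sSeperator);
            String sTab = "";
            for(int i=0;i<iFolderLength;i++)
            {
                if(i>0)
                    sTab = sTab+"    ";
                if(!sLastPath.equals(sRow))
                {
                    if(sLastFolders!=null && sLastFolders.length>i)
                    {
                        // first column that differs from the previous row: print it and
                        // everything below it
                        if(!sLastFolders[i].equals(sFolders[i]))
                        {
                            arReturnNodes.add(sTab+sFolders[i]+"");
                            sLastFolders = null;
                        }
                    }
                    else
                    {
                        arReturnNodes.add(sTab+sFolders[i]+"");
                    }
                }
            }
            sLastPath = sRow;
        }
        return arReturnNodes;
    }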

