Library Element Data_importEdgesFromTable

This article describes the code-based import of edges from an Excel or CSV table. It applies if the file contains edges of a single class and holds only the identifying attribute values of the source and target nodes plus edge attribute values, while the column headers do not exactly match the attribute names defined in the metamodel. The edge class name, the source and target node class names, their identifying attribute names and the attribute mapping are defined as function arguments. This allows edges of one class to be created automatically with a user-defined mapping of property values.

The import is done using the Data_importEdgesFromTable library function. A general introduction to data import library functions can be found here.

Data structure

Use the library function Data_importEdgesFromTable to import a list of edges if your import data table is structured like this:

  • the first two columns of the data area contain the source and target node identifying attribute values. The other source/target node identifiers (class name and identifying attribute name) are defined as function arguments.
  • (optional) the third column and up of the data area contain attribute values of the edges, but the attribute names in the header row do not exactly match the attribute names in the metamodel: values of these columns can be manually mapped to any edge attribute, meaning the user can define the attribute mapping in the rule arguments.
  • the edge class name is not part of the table but all imported edges are of the same class (=typed, defined as a rule argument). Thus, you can only import edges of one edge class per import sequence.

source node attribute value | target node attribute value | Some Attr. 1 | Some Attr. 2
100                         | 1                           | C1           |
100                         | 2                           | C2           |
1                           | 3                           | C3           |

Exemplary edge data table containing identifying attribute values of source and target node (first and second column). The columns 3 and up contain the edge attribute values. The names in the header row do not need to match the attribute names defined in the metamodel.

If your data looks different please see this article to find the correct import function for your data scheme.

Library function scheme

When adding the library function Data_importEdgesFromTable from the Libraries pad to a rulesheet (*.grg file), the created code looks as follows. To execute the function, it needs to be included in a sequence as explained later.

Data_importEdgesFromTable( FILE_NAME, FIRST_ROW, FIRST_COLUMN, LAST_ROW, LAST_COLUMN, TABLE_NAME, SEPARATOR, ONLY_FIRST_EDGE, EDGE_CLASS, SRC_N_CLASS, SRC_ATTR, TRG_N_CLASS, TRG_ATTR, ATTR_MAP )

Arguments:

  • FILE_NAME – string type argument defining the (relative) path to the Excel or CSV file.
    The FILE_NAME path can be relative to the project's path. If you added your data file DataFile.xlsx to the project folder Data, the FILE_NAME will be “..\\..\\Data\\DataFile.xlsx”.
  • FIRST_ROW, FIRST_COLUMN, LAST_ROW, LAST_COLUMN – int type arguments defining the data area of the sheet to be imported.
    Rows and columns are denoted as integers, so to import the area from cell A1 to I4 the denotation would be 1,1,4,9. The row containing the column headers MUST NOT be included in the data area! To import the complete sheet, set LAST_ROW and LAST_COLUMN to -1, -1. The importer will then automatically detect the last row and/or column containing data. Only one data area can be imported per import sequence.
  • TABLE_NAME – string type argument defining the exact name of the worksheet in an Excel file.
    For CSV files this must be an empty string (“”).
  • SEPARATOR – string type argument defining the separator used in a CSV file (e.g. “;” or “|”).
    For Excel files this must be an empty string (“”).
  • ONLY_FIRST_EDGE – boolean type argument defining if multiple source/target node matches should be considered or not.
    If set to false, the importer will create edges for all found pairs of source and target node candidates. If set to true, an edge will be created only for the first found pair (which pair that is, is not determined). Scroll to the end of the article to find a detailed explanation.
  • EDGE_CLASS – string type argument defining the class of the edges to be created.
    The string must match a class name defined in the metamodel. All imported edges will be of this class.
  • SRC_N_CLASS – string type argument defining the class of the source nodes.
    The string must match a class name defined in the metamodel. All source nodes will be of this class.
  • SRC_ATTR – string type argument defining the identifying attribute of the source nodes.
    The string must match an attribute name defined in the metamodel. This will be the identifying attribute for all source nodes.
  • TRG_N_CLASS – string type argument defining the class of the target nodes.
    The string must match a class name defined in the metamodel. All target nodes will be of this class.
  • TRG_ATTR – string type argument defining the identifying attribute of the target nodes.
    The string must match an attribute name defined in the metamodel. This will be the identifying attribute for all target nodes.
  • ATTR_MAP – map of type int->string allowing the user to define an attribute mapping.
    The mapping follows a specific scheme: 3 -> “label”, 4 -> “quantity”, … meaning that the value in the third column of the selected data area is mapped to the attribute label, the value of the fourth column to the attribute quantity and so on. IMPORTANT: The mapping is defined in reference to the data area. Thus, if your imported data area starts in cell B2 the mapping 2 -> … relates to column C, 3 -> … to column D and so on. Please see the example below for more details.
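
The index conventions described in the argument list can be checked with a small Python sketch. It is not part of the library: the helper names column_letter, column_index, area_arguments and mapped_sheet_columns are hypothetical and only model the arithmetic, assuming 1-based row and column numbers and ATTR_MAP keys that count from the first column of the data area.

```python
import re

def column_letter(index: int) -> str:
    """Convert a 1-based column number to a spreadsheet letter (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters

def column_index(letters: str) -> int:
    """Convert a spreadsheet column letter to its 1-based number (A -> 1, I -> 9)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index

def area_arguments(first_cell: str, last_cell: str) -> tuple:
    """Return (FIRST_ROW, FIRST_COLUMN, LAST_ROW, LAST_COLUMN) for a cell range."""
    def split(cell):
        match = re.fullmatch(r"([A-Za-z]+)(\d+)", cell)
        if match is None:
            raise ValueError(f"not a cell reference: {cell!r}")
        return int(match.group(2)), column_index(match.group(1))
    (row1, col1), (row2, col2) = split(first_cell), split(last_cell)
    return (row1, col1, row2, col2)

def mapped_sheet_columns(first_column: int, attr_map: dict) -> dict:
    """Translate ATTR_MAP keys (relative to the data area) into sheet column letters."""
    return {column_letter(first_column + key - 1): name
            for key, name in attr_map.items()}

# Area A1..I4 is denoted 1,1,4,9.
AREA = area_arguments("A1", "I4")
# Data area starting in column B (FIRST_COLUMN = 2): key 3 is column D, key 4 is column E.
COLUMNS = mapped_sheet_columns(2, {3: "label", 4: "quantity"})
print(AREA, COLUMNS)
```

Running such a check before writing the sequence can help catch off-by-one mistakes when the data area does not start in column A.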

Calling of library function in sequence

Assume that the edges data list to be imported looks like this, either as an Excel or as a CSV table:

  | A | B              | C              | D     | E
1 |   |                |                |       |
2 |   | src attr value | trg attr value | label | quantity
3 |   | 100            | 1              | C1    | 5
4 |   | 100            | 2              | C2    | 2
5 |   | 1              | 3              | C3    | 1

We import this list into an existing graph that contains the Product and Part nodes required as source and target nodes (Product with id 100, Part with id 1, Part with id 2, Part with id 3). Two examples of how to import this data using the library function in a sequence are given below. The sequence names can be freely defined by the user. The Excel and the CSV files are stored in the project's Data folder. For both files the data is imported from cell B3 (FIRST_ROW 3, FIRST_COLUMN 2) to the last row and column containing data. The import data area MUST NOT include the header row, which is row 2 here. The worksheet of the Excel file is named EdgesSheet. The separator in the CSV file is ; . All possible edges will be created, as ONLY_FIRST_EDGE is set to false, and all created edges will be of the edge class Contains.

sequence importEdgesFromExcelFile {
   Data_importEdgesFromTable("..\\..\\Data\\DataEdges.xlsx", 3, 2, -1, -1, 
   "EdgesSheet", "", false, "Contains", "Product", "id", "Part", "id", map<int, string>{
   3->"label",
   4->"quantity"})
} 

sequence importEdgesFromCsvFile {
   Data_importEdgesFromTable("..\\..\\Data\\DataEdges.csv", 3, 2, -1, -1, 
   "", ";", false, "Contains", "Product", "id", "Part", "id", map<int, string>{
   3->"label",
   4->"quantity"})
} 

Note that the first two columns in the data area must follow the scheme described above. The headers of the table columns are not important. Given that the input data graph only contained the 4 nodes mentioned above, the result of this import could look like this:

Resulting graph file using the Data_importEdgesFromTable library element

As the sequence is fixed to import “Contains” edges from Product nodes to Part nodes and their respective identifying attributes, edges will only be built for the table rows which fit this scheme. Note that for the fifth row of the table above no edge is built as there is no Product with id 1 (this is a Part). To import this edge, we need an additional import sequence or change the class Product in the function arguments to a node superclass that contains Product and Part nodes.
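
The matching behaviour described above can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the importer's implementation: the function plan_edges and the in-memory NODES and ROWS structures are hypothetical, attribute values are compared as strings, and superclass matching is not modelled.

```python
# Hypothetical in-memory model of the example graph: one Product and three Parts.
NODES = [
    {"class": "Product", "id": "100"},
    {"class": "Part", "id": "1"},
    {"class": "Part", "id": "2"},
    {"class": "Part", "id": "3"},
]

# Data area of the example table (sheet rows 3 to 5, columns B to E).
ROWS = [
    ("100", "1", "C1", "5"),
    ("100", "2", "C2", "2"),
    ("1", "3", "C3", "1"),
]

def plan_edges(nodes, rows, edge_class, src_class, src_attr,
               trg_class, trg_attr, attr_map):
    """List the edges a row-by-row match would produce (ONLY_FIRST_EDGE = false)."""
    edges = []
    for row in rows:
        # Column 1 of the data area identifies the source, column 2 the target.
        sources = [n for n in nodes
                   if n["class"] == src_class and n.get(src_attr) == row[0]]
        targets = [n for n in nodes
                   if n["class"] == trg_class and n.get(trg_attr) == row[1]]
        # ATTR_MAP keys are 1-based column numbers within the data area.
        attrs = {name: row[col - 1] for col, name in attr_map.items()}
        for source in sources:
            for target in targets:
                edges.append((edge_class, source["id"], target["id"], attrs))
    return edges

EDGES = plan_edges(NODES, ROWS, "Contains", "Product", "id", "Part", "id",
                   {3: "label", 4: "quantity"})
for edge in EDGES:
    print(edge)
```

In this model only the first two data rows yield an edge; the last row finds no Product with id 1 and is skipped, which mirrors the behaviour described for the fifth sheet row.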

Explanation for ONLY_FIRST_EDGE argument:

In order to create an edge, the importer has to find a pair of source and target nodes. It might be the case that the importer finds more than one pair of node candidates, for example if the identifying attribute is not unique. Imagine the following scenario:

Source node class: BlueNode
Source attribute: id
Source attribute value: 1
Target node class: RedNode
Target attribute: id
Target attribute value: 2

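ONLY_FIRST_EDGE = false: an edge is created for every found pair of matching source and target nodes.
[Figure: import result with ONLY_FIRST_EDGE = false]

ONLY_FIRST_EDGE = true: an edge is created only for the first found pair; which pair that is, is not determined.
[Figure: import result with ONLY_FIRST_EDGE = true]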
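
The difference between the two settings can be sketched in Python as follows. This is only an illustration: the function candidate_pairs is hypothetical, and the assumption of two BlueNodes with id 1 and two RedNodes with id 2 is chosen for the example. The sketch returns the first pair in list order, whereas the real importer does not guarantee which pair is picked.

```python
from itertools import product

# Assumed example graph: the identifying attribute "id" is not unique.
BLUE_NODES = ["BlueNode#a (id=1)", "BlueNode#b (id=1)"]
RED_NODES = ["RedNode#a (id=2)", "RedNode#b (id=2)"]

def candidate_pairs(sources, targets, only_first_edge):
    """Return the source/target pairs for which an edge would be created."""
    pairs = list(product(sources, targets))
    # With ONLY_FIRST_EDGE = true only one pair is kept; list order is used
    # here purely for illustration.
    return pairs[:1] if only_first_edge else pairs

ALL_PAIRS = candidate_pairs(BLUE_NODES, RED_NODES, only_first_edge=False)
FIRST_PAIR = candidate_pairs(BLUE_NODES, RED_NODES, only_first_edge=True)
print(len(ALL_PAIRS), len(FIRST_PAIR))
```

With non-unique identifiers, setting ONLY_FIRST_EDGE to false therefore multiplies the number of edges per table row by the number of matching source and target candidates.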
