https://github.com/dmlc/dgl
Revision 411bef5498bc79641858ea15ee79b6085c855187 authored by Tomasz Patejko on 11 June 2021, 09:09:30 UTC, committed by GitHub on 11 June 2021, 09:09:30 UTC
1 parent 17d604b
Raw File
Tip revision: 411bef5498bc79641858ea15ee79b6085c855187 authored by Tomasz Patejko on 11 June 2021, 09:09:30 UTC
[CPU, Parallel] parallel_for with default grain size (#3004)
Tip revision: 411bef5
migrate-guide-0.5.md
# Migration Guide for DGL 0.5

## Breaking changes

The following changes may break existing codes if the related APIs are used. Note that **most of the removed APIs have quite rare use cases** and have quite easy replacements.

1. DGLGraph now requires the graph structure and feature data to have the same device placement. If the given node/edge feature tensors have different devices as the graph’s, dgl.ndata and dgl.edata will raise an error as follow:
    ```bash
    dgl._ffi.base.DGLError: Cannot assign node feature "x" on device cpu to a graph on device cuda:0.
    Call DGLGraph.to() to copy the graph to the same device.
    ```
    To fix it, copy either the graph (using the `DGLGraph.to` API) or the feature tensors to the same device.

1. Changes to `dgl.graph`:
    * No longer accept SciPy matrix/NetworkX graph as the input data. Use `dgl.from_scipy`/`dgl.from_networkx` instead.
    * `ntype` and `etype` are removed from the arguments. To construct graphs with named node/edge types, use `dgl.heterograph`.
        ```python
        g = dgl.heterograph(('user', 'follows', 'user') : ...)
        ```
    * `validate` is removed from the arguments. DGL now always checks whether the num_nodes is greater than the largest node ID if specified.
1. `dgl.bipartite` is removed.
    * To create a uni-directional bipartite graph, use `dgl.heterograph`. E.g.,
        ```python
        g = dgl.hetrograph(('user', 'rates', 'movie'): ...)
        ```
    * To create a uni-directional bipartite graph from a SciPy matrix, use the new API `dgl.bipartite_from_scipy`.
    * To create a uni-directional bipartite graph from a NetworkX graph, use the new API `dgl.bipartite_from_networkx`.
1. Changes to `dgl.heterograph`:
    * No longer accept SciPy matrix/NetworkX graph as the input data. Use the `from_*` APIs to create graphs first and then pass their edges to the `dgl.heterograph` API. E.g.,
        ```python
        nx_g = ...  # some networkx graph
        spmat = ...  # some scipy matrix
        g1 = dgl.from_networkx(nx_g)
        g2 = dgl.bipartite_from_scipy(spmat)
        g = dgl.heterograph({('user', 'follows', 'user') : g1.edges(),
                             ('user', 'rates', 'movie') : g2.edges()})
        ```
1. `dgl.hetero_from_relations` is removed. Use `dgl.heterograph` instead.
1. From 0.5, subgraphs extracted via DGL APIs automatically inherits node and edge features from the parent graph. DGL also saves the original nodes/edge IDs in `subg.ndata[dgl.NID]` and `subg.edata[dgl.EID]` if nodes/edges are relabeled. This new behavior makes the following `DGLGraph` methods useless and we thus remove them:
    * `DGLGraph.parent`, `DGLGraph.parent_nid`, `DGLGraph.parent_eid`, `DGLGraph.map_to_subgraph_nid`, `DGLGraph.copy_from_parent`, `DGLGraph.copy_to_parent` and `DGLGraph.detach_parent`.
1. Other removed DGLGraph APIs:
    * `DGLGraph.from_networkx`. Use `dgl.from_networkx` to construct a DGLGraph from a NetworkX graph.
    * `DGLGraph.from_scipy_sparse_matrix`. Use `dgl.from_scipy` to construct a DGLGraph from a SciPy matrix.
    * `DGLGraph.register_apply_node_func` , `DGLGraph.register_apply_edge_func`, `DGLGraph.register_message_func` and `DGLGraph.register_reduce_func`. Please specify them directly as the arguments of the message passing APIs.
        ```python
        g = ...  # some graph
        # before 0.5
        g.register_message_func(mfunc)
        g.register_reduce_func(rfunc)
        g.update_all()
        
        # starting from 0.5
        g.update_all(mfunc, rfunc)
        ```
    * `DGLGraph.group_apply_edges`. To normalize edge weights within the neighborhood of each destination node, use `dgl.nn.edge_softmax`. To normalize edge weights within the neighborhood of each source node, use `dgl.reverse` first before the edge softmax.
    * `DGLGraph.send` and `DGLGraph.recv`. There are rarely any cases where send and recv must be invoked separately. Use `DGLGraph.send_and_recv` or `DGLGraph.update_all` for message passing.
    * `DGLGraph.multi_recv`, `DGLGraph.multi_pull`, `DGLGraph.multi_send_and_recv`. To perform message passing on a part  of the nodes and edges, use `dgl.node_subgraph` or `dgl.edge_subgraph` to extract the subset first and then call `DGLGraph.multi_update_all`.
    * `DGLGraph.clear`. Use `dgl.graph(([], []))`` to create a new empty graph.
    * `DGLGraph.subgraphs`. Use `DGLGraph.subgraph`.
    * `DGLGraph.batch_num_nodes` and `DGLGraph.batch_num_edges` are now functions that accept node/edge type as the only argument for getting batching information of a heterograph.
    * `DGLGraph.flatten`. To create a new graph without batching information, use `new_g = gl.graph(old_g.edges())``.
1. The reduce function `dgl.function.prod` is removed.
1. `dgl.add_self_loop` will NOT remove existing self loops automatically. It is recommanded to call `dgl.remove_self_loop` before invoking `dgl.add_self_loop`.



## Deprecations

Will not break old codes but will throw deprecation warning.

### Core APIs

1. Creating a graph using `dgl.DGLGraph(data)` is deprecated. Use `dgl.graph(data)`.
1. Deprecated `DGLGraph` methods:
    - `DGLGraph.to_networkx` -> `dgl.to_networkx`
    - `DGLGraph.readonly` and `DGLGraph.is_readonly`. Before 0.5, this flag is a hint for more efficient implementation. From 0.5, the efficiency issue has been resolved so they become useless. 
    - `DGLGraph.__len__` -> `DGLGraph.number_of_nodes`
    - `dgl.DGLGraph.__contains__` -> `DGLGraph.has_nodes`
    - `DGLGraph.add_node` -> `DGLGraph.add_nodes`
    - `DGLGraph.add_edge` -> `DGLGraph.add_edges`
    - `DGLGraph.has_node` -> `DGLGraph.has_nodes`
    - `DGLGraph.has_edge_between` -> `DGLGraph.has_edges_between` 
    - `DGLGraph.edge_id` -> `dgl.DGLGraph.edge_ids`.
    - `DGLGraph.in_degree` -> `dgl.DGLGraph.in_degrees`.
    - `DGLGraph.out_degree` -> `dgl.DGLGraph.out_degrees`.
1. `dgl.to_simple_graph` -> `dgl.to_simple`.
1. `dgl.to_homo` -> `dgl.to_homogeneous`.
1. `dgl.to_hetero` -> `dgl.to_heterogeneous`.
1. `dgl.as_heterograph` and `dgl.as_immutable_graph` are deprecated as `dgl.DGLGraph` and `dgl.DGLHeteroGraph` are now merged.
1. `dgl.batch_hetero` -> `dgl.batch`
1. `dgl.unbatch_hetero` -> `dgl.unbatch`
1. The `node_attrs` / `edge_attrs` arguments of `dgl.batch` are renamed to `ndata` / `edata`.
1. The arguments `share_ndata` and `share_edata` of `dgl.reverse` are renamed to `copy_ndata` and `copy_edata`.

### Dataset APIs

For all the current datsets, their class attributes such as `graph`, `feat`, etc. are deprecated. The recommended usage is to get them from each sample:
```python
# Before 0.5
dataset = dgl.data.CoraFull()
g = dataset.graph
feat = dataset.feat
...

# From 0.5
dataset = dgl.data.CoraFullDataset()  # in 0.5, all the classes have a "Dataset" in the name.
g = dataset[0]  # is directly a DGLGraph object
feat = g.ndata['feat']
...
```

**Other changes**
* ``dgl.data.SST`` is deprecated and replaced by ``dgl.data.SSTDataset``. The attribute ``trees`` is deprecated and replaced by ``__getitem__``. The attribute ``num_vocabs`` is deprecated and replaced by ``vocab_size``
back to top