Change Interface of NDArray & TBlob for DLPack Compatible #6345

Merged
merged 12 commits into from May 30, 2017

Conversation

Projects
None yet
4 participants
Contributor

ZihengJiang commented May 19, 2017

include/mxnet/ndarray.h
@@ -405,6 +402,16 @@ class NDArray {
}
};
+ void SetTBlob() {
@piiswrong

piiswrong May 19, 2017

Contributor

Do you still need this if you are setting it everytime in data()?

include/mxnet/tensor_blob.h
+ * \brief return the corresponding DLTensor
+ * \return the address of internal DLTensor
+ */
+ inline const DLTensor& GetDLTensor() {
@piiswrong

piiswrong May 19, 2017

Contributor

GetDLTensor -> dltensor

src/ndarray/ndarray.cc
@@ -614,8 +614,7 @@ NDArray &NDArray::operator/=(const real_t &src) {
}
void NDArray::Save(dmlc::Stream *strm) const {
- // save shape
- shape_.Save(strm);
@piiswrong

piiswrong May 19, 2017

Contributor

@tqchen Does ndarray dupm files have a version system? Can we store new arrays as int64 while supporting legacy read from int32?

@mli

mli May 19, 2017

Member

i don't think so. but we should add a version number

@ZihengJiang

ZihengJiang May 19, 2017

Contributor

Let us do it in another PR. For now, we just make sure it is backward compatible.

@tqchen

tqchen May 20, 2017

Contributor

There is a reserved field on the dump file format that can be used for version checking

@tqchen

tqchen May 20, 2017

Contributor

@ZihengJiang maybe consider to do a backward compatible version now by utilizing the hint?

@piiswrong

piiswrong May 20, 2017

Contributor

You can bump the version to 0.9.7 and store that in the reserved field

@ZihengJiang

ZihengJiang May 20, 2017

Contributor

ok, no problem

include/mxnet/tensor_blob.h
+
+ inline void SetDLTensor() {
+ dltensor_.data = dptr_;
+ dltensor_.ctx = DLContext{-1, static_cast<DLDeviceType>(dev_mask_)};
@tqchen

tqchen May 21, 2017

Contributor

we need to fix context to be the right one when we really want to use DLTensor, otherwise it won't be usable

@ZihengJiang

ZihengJiang May 22, 2017

Contributor

Then how to handle the legacy constructors without device id?

@tqchen

tqchen May 22, 2017

Contributor

@piiswrong any ideas? do you think if we can construct TBlob with dev_id by default?

@piiswrong

piiswrong May 22, 2017

Contributor

we can have a new argument defaults to -1
then modify all the places it was used in mxnet.

@tqchen

tqchen May 22, 2017

Contributor

to be clear, device_id equals -1 won't be able to be used in dlpack compatible functions

include/mxnet/tensor_blob.h
- stride_(shape[shape.ndim() - 1]),
- dev_mask_(dev_mask),
- type_flag_(type_flag) {
+ TBlob(void *dptr, const TShape &shape, int dev_mask, int dev_id, int type_flag)
@piiswrong

piiswrong May 23, 2017

Contributor

put dev_id at last and give it a default value -1

include/mxnet/ndarray.h
@@ -121,7 +121,7 @@ class NDArray {
raw_shape[0] = length;
MSHADOW_TYPE_SWITCH(dtype_, DType, {
res = TBlob(static_cast<DType*>(ptr_->shandle.dptr)
- + offset_ + offset, raw_shape, ptr_->shandle.ctx.dev_mask());
+ + offset_ + offset, raw_shape, ptr_->shandle.ctx);
@tqchen

tqchen May 23, 2017

Contributor

why two offset here? maybe consider combine into one?

@ZihengJiang

ZihengJiang May 24, 2017

Contributor

the original tblob::chunk has one offset, also raw_data can accept an offset as an argument

.gitmodules
@@ -9,7 +9,10 @@
url = https://github.com/dmlc/ps-lite
[submodule "nnvm"]
path = nnvm
- url = https://github.com/dmlc/nnvm
+ url = https://github.com/ZihengJiang/nnvm
@mli

mli May 24, 2017

Member

why use ZihengJiang/nnvm instead of dmlc/nnvm?

@ZihengJiang

ZihengJiang May 24, 2017

Contributor

As you can see here in the log of ci(http://ec2-52-25-96-65.us-west-2.compute.amazonaws.com/blue/organizations/jenkins/mxnet/detail/PR-6345/12/pipeline/220#step-293-log-131). The size of input_shape is right in the mxnet side but wrong in the nnvm side. So I'm trying to log it out in my nnvm branch for debuging. (There maybe something wrong in CI with updated submodule). But it looks like this method doesn't work due to CI cannot change to my branch.

ZihengJiang added some commits May 24, 2017

Change Interface of NDArray & TBlob for DLPack Compatible
Fix for cudnn operator

Fix cpp tests
src/ndarray/ndarray.cc
@@ -639,25 +643,35 @@ void NDArray::Save(dmlc::Stream *strm) const {
}
bool NDArray::Load(dmlc::Stream *strm) {
@piiswrong

piiswrong May 28, 2017

Contributor

change the interface directly

Contributor

piiswrong commented May 28, 2017

please add tests for legacy ndarray loading

ZihengJiang added some commits May 28, 2017

include/mxnet/tensor_blob.h
- TBlob(DType *dptr,
- const TShape &shape,
- int dev_mask)
+ TBlob(DType *dptr, const TShape &shape, int dev_mask, int dev_id)
@piiswrong

piiswrong May 29, 2017

Contributor

TBlob constructor interface need to be backward compatible. add -1 as default here

@piiswrong piiswrong merged commit 05e0728 into apache:master May 30, 2017

3 of 4 checks passed

continuous-integration/travis-ci/pr The Travis CI build could not complete due to an error
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/jenkins/pr-head This commit looks good
Details
mxnet-pr-r
Details

cjolivier01 added a commit to cjolivier01/mxnet that referenced this pull request May 30, 2017

Change Interface of NDArray & TBlob for DLPack Compatible (#6345)
* Change Interface of NDArray & TBlob for DLPack Compatible

Fix for cudnn operator

Fix cpp tests

* Update nnvm

* Fix for MKL mem

* Fix for windows macro

* Bump up version number to 0.10.1

* Update NDArray Save&Load

* trigger update

* Add test for legacy data load

* Use LegacyTShapeLoad

* trigger update

* Update tensor_blob.h

piiswrong added a commit that referenced this pull request Jun 5, 2017

batchnorm specify channel axis and performance optimizations for batc…
…hnorm (#6411)

* Add channel_axis to batch norm, performance improvements

* rearrange tests a bit

* rearrange tests a bit

* CR changes

* cpp package link issue

* Fix: MSVC wants all parallel omp to be int

* CR comments, expand legal negative axes

* lint

* lint

* Fix download link (#6431)

* Fix Download Button

* Small fix

* Add release note (#6434)

* Fixing tutorials. (#6436)

Most of the fixes should be self evident. For tutorial on pre-trained models, one of the images doesn't exist anymore so selected a new one. Long-term, we should put such images on web-data repo but alas, some other day.

For Handwritten digit tutorial, we are missing couple of imports in the test_utils.py that was recently created.

Note that: for pre-trained model tutorial, we get a softmax_label warning and the probability scores are not really probabilities. Will deal with that issue in another PR.

Testing:
I've tried to test all the notebooks with this change and things look fine.

* Formatting fixes (#6433)

* Formatting fixes

* lint fixed

* fix

* doc bash 2-5, for pack, unpack, pack_img and unpack_img (#6140)

* doc bash for pack, unpack, pack_img and unpack_img

* Add comments for labels could be 1d list

* Update recordio.py

* Update recordio.py

* Update recordio.py

fixing text

* Update recordio.py

fixing text

* remove empty line

* Improve style (#6445)

* Correction (#6444)

* CSVIter example correction

* fix

* Update documentation for MXNetDataIter in io.py (#6000) (#6113)

* Update documentation for MXNetDataIter in io.py (#6000)

* [DOC] Respond to feedback (#6113)

* Fix minor issues with api pages. (#6410)

1. In the notes section for ndarray, references did not seem clear enough to be referring to mxnet.ndarray or numpy.ndarray. Added the package names as prefixes to make it more obvious.
2. "share the same C++ operator source codes" => "share the same code". Since we don't really need to throw in more details than required.
3. Other relatively  minor language changes which will be obvious from the diff.

Note that I'm relatively not sure about the need for 1/ since it makes things more verbose. Let me know if it unnecessary and I'll remove it.

* fixing the early stop for maximize = T  (#5915)

close #4587

* Update documentation for mxnet.ndarray.GridGenerator. (#6430)

* Update documentation for mxnet.ndarray.GridGenerator.

Thanks @Lyken17 for #6147

* Fix lint error.

* Minor fix.

* Remove the example.

* Update documentation for deconvolution operation. (#6184)

* Update documentation for deconvolution operation.

* Add examples.

* Remove the example.

* skip lines that have %matplotlib (#6451)

* Fixing some more broken links before v0.10 release (#6449)

* close #4838 (#6452)

* Fix linear regression (#6432)

* Fix Linear Regression Tutorial

* Small fix

* Pre-trained model tutorial fixes. (#6453)

Before the change on running the tutorial for the first time: "UserWarning: Data provided by label_shapes don't match names specified by label_names ([] vs. ['softmax_label'])". It also showed probability of >>1 due to incorrect usage of np.argsort().

* Nightly test tutorial (#6447)

* Add tutorial test

* Fix pylint

* Small fix

* fix member variable name: make them end with underline (#6438)

* [R] captcha example (#6443)

* skip lines that have %matplotlib (#6459)

* Fix cudnn_deconv not guarding no_bias (#6456)

* Fixing up issues in install guide (#6463)

* Fixing copy code functionality for bash command (#6465)

* Residual unroll (#6397)

* residual unroll

* unroll for residual cell

* merge_outputs fix

* Linear regression Tutorial link (#6468)

* Fixing a link in the linear regression tutorial.

The link was initally going to mxnet-test.readthedocs.io. Changed it to mxnet.io/api.

* More appropriate language.

* bump up version number for release (#6462)

* bump up version number for release

* update version for scala/R/backend

* [R][DOC] update R installation guide (#6457)

* Use sphinx==1.3.5 in Dockerfile.doc (#6470)

changed PR name

* Add 0.10 release info to README.md and NEWS.md (#6471)

@nswamy wants to merge it immediately, so i'm going to do it now. I also changed the PR title.

* fix batchNorm cpp example (#6454)

* Update im2rec.py (#6473)

Updated Line 107 of 'im2rec.py'. Read an image as binary.

* Change Interface of  NDArray & TBlob for DLPack Compatible (#6345)

* Change Interface of NDArray & TBlob for DLPack Compatible

Fix for cudnn operator

Fix cpp tests

* Update nnvm

* Fix for MKL mem

* Fix for windows macro

* Bump up version number to 0.10.1

* Update NDArray Save&Load

* trigger update

* Add test for legacy data load

* Use LegacyTShapeLoad

* trigger update

* Update tensor_blob.h

* change 'channel_axis' parameter to 'axis'

* Change DEFAULT_CHANNEL_AXIS to DEFAULT_AXIS

* wait for dlpack PR to go through

* Trigger build

Guneet-Dhillon pushed a commit to Guneet-Dhillon/mxnet that referenced this pull request Sep 13, 2017

Change Interface of NDArray & TBlob for DLPack Compatible (#6345)
* Change Interface of NDArray & TBlob for DLPack Compatible

Fix for cudnn operator

Fix cpp tests

* Update nnvm

* Fix for MKL mem

* Fix for windows macro

* Bump up version number to 0.10.1

* Update NDArray Save&Load

* trigger update

* Add test for legacy data load

* Use LegacyTShapeLoad

* trigger update

* Update tensor_blob.h

Guneet-Dhillon pushed a commit to Guneet-Dhillon/mxnet that referenced this pull request Sep 13, 2017

batchnorm specify channel axis and performance optimizations for batc…
…hnorm (#6411)

* Add channel_axis to batch norm, performance improvements

* rearrange tests a bit

* rearrange tests a bit

* CR changes

* cpp package link issue

* Fix: MSVC wants all parallel omp to be int

* CR comments, expand legal negative axes

* lint

* lint

* Fix download link (#6431)

* Fix Download Button

* Small fix

* Add release note (#6434)

* Fixing tutorials. (#6436)

Most of the fixes should be self evident. For tutorial on pre-trained models, one of the images doesn't exist anymore so selected a new one. Long-term, we should put such images on web-data repo but alas, some other day.

For Handwritten digit tutorial, we are missing couple of imports in the test_utils.py that was recently created.

Note that: for pre-trained model tutorial, we get a softmax_label warning and the probability scores are not really probabilities. Will deal with that issue in another PR.

Testing:
I've tried to test all the notebooks with this change and things look fine.

* Formatting fixes (#6433)

* Formatting fixes

* lint fixed

* fix

* doc bash 2-5, for pack, unpack, pack_img and unpack_img (#6140)

* doc bash for pack, unpack, pack_img and unpack_img

* Add comments for labels could be 1d list

* Update recordio.py

* Update recordio.py

* Update recordio.py

fixing text

* Update recordio.py

fixing text

* remove empty line

* Improve style (#6445)

* Correction (#6444)

* CSVIter example correction

* fix

* Update documentation for MXNetDataIter in io.py (#6000) (#6113)

* Update documentation for MXNetDataIter in io.py (#6000)

* [DOC] Respond to feedback (#6113)

* Fix minor issues with api pages. (#6410)

1. In the notes section for ndarray, references did not seem clear enough to be referring to mxnet.ndarray or numpy.ndarray. Added the package names as prefixes to make it more obvious.
2. "share the same C++ operator source codes" => "share the same code". Since we don't really need to throw in more details than required.
3. Other relatively  minor language changes which will be obvious from the diff.

Note that I'm relatively not sure about the need for 1/ since it makes things more verbose. Let me know if it unnecessary and I'll remove it.

* fixing the early stop for maximize = T  (#5915)

close #4587

* Update documentation for mxnet.ndarray.GridGenerator. (#6430)

* Update documentation for mxnet.ndarray.GridGenerator.

Thanks @Lyken17 for apache#6147

* Fix lint error.

* Minor fix.

* Remove the example.

* Update documentation for deconvolution operation. (#6184)

* Update documentation for deconvolution operation.

* Add examples.

* Remove the example.

* skip lines that have %matplotlib (#6451)

* Fixing some more broken links before v0.10 release (#6449)

* close #4838 (#6452)

* Fix linear regression (#6432)

* Fix Linear Regression Tutorial

* Small fix

* Pre-trained model tutorial fixes. (#6453)

Before the change on running the tutorial for the first time: "UserWarning: Data provided by label_shapes don't match names specified by label_names ([] vs. ['softmax_label'])". It also showed probability of >>1 due to incorrect usage of np.argsort().

* Nightly test tutorial (#6447)

* Add tutorial test

* Fix pylint

* Small fix

* fix member variable name: make them end with underline (#6438)

* [R] captcha example (#6443)

* skip lines that have %matplotlib (#6459)

* Fix cudnn_deconv not guarding no_bias (#6456)

* Fixing up issues in install guide (#6463)

* Fixing copy code functionality for bash command (#6465)

* Residual unroll (#6397)

* residual unroll

* unroll for residual cell

* merge_outputs fix

* Linear regression Tutorial link (#6468)

* Fixing a link in the linear regression tutorial.

The link was initally going to mxnet-test.readthedocs.io. Changed it to mxnet.io/api.

* More appropriate language.

* bump up version number for release (#6462)

* bump up version number for release

* update version for scala/R/backend

* [R][DOC] update R installation guide (#6457)

* Use sphinx==1.3.5 in Dockerfile.doc (#6470)

changed PR name

* Add 0.10 release info to README.md and NEWS.md (#6471)

@nswamy wants to merge it immediately, so i'm going to do it now. I also changed the PR title.

* fix batchNorm cpp example (#6454)

* Update im2rec.py (#6473)

Updated Line 107 of 'im2rec.py'. Read an image as binary.

* Change Interface of  NDArray & TBlob for DLPack Compatible (#6345)

* Change Interface of NDArray & TBlob for DLPack Compatible

Fix for cudnn operator

Fix cpp tests

* Update nnvm

* Fix for MKL mem

* Fix for windows macro

* Bump up version number to 0.10.1

* Update NDArray Save&Load

* trigger update

* Add test for legacy data load

* Use LegacyTShapeLoad

* trigger update

* Update tensor_blob.h

* change 'channel_axis' parameter to 'axis'

* Change DEFAULT_CHANNEL_AXIS to DEFAULT_AXIS

* wait for dlpack PR to go through

* Trigger build

kice added a commit to kice/mxnet that referenced this pull request Dec 20, 2017

Change Interface of NDArray & TBlob for DLPack Compatible (#6345)
* Change Interface of NDArray & TBlob for DLPack Compatible

Fix for cudnn operator

Fix cpp tests

* Update nnvm

* Fix for MKL mem

* Fix for windows macro

* Bump up version number to 0.10.1

* Update NDArray Save&Load

* trigger update

* Add test for legacy data load

* Use LegacyTShapeLoad

* trigger update

* Update tensor_blob.h

kice added a commit to kice/mxnet that referenced this pull request Dec 20, 2017

batchnorm specify channel axis and performance optimizations for batc…
…hnorm (#6411)

* Add channel_axis to batch norm, performance improvements

* rearrange tests a bit

* rearrange tests a bit

* CR changes

* cpp package link issue

* Fix: MSVC wants all parallel omp to be int

* CR comments, expand legal negative axes

* lint

* lint

* Fix download link (#6431)

* Fix Download Button

* Small fix

* Add release note (#6434)

* Fixing tutorials. (#6436)

Most of the fixes should be self evident. For tutorial on pre-trained models, one of the images doesn't exist anymore so selected a new one. Long-term, we should put such images on web-data repo but alas, some other day.

For Handwritten digit tutorial, we are missing couple of imports in the test_utils.py that was recently created.

Note that: for pre-trained model tutorial, we get a softmax_label warning and the probability scores are not really probabilities. Will deal with that issue in another PR.

Testing:
I've tried to test all the notebooks with this change and things look fine.

* Formatting fixes (#6433)

* Formatting fixes

* lint fixed

* fix

* doc bash 2-5, for pack, unpack, pack_img and unpack_img (#6140)

* doc bash for pack, unpack, pack_img and unpack_img

* Add comments for labels could be 1d list

* Update recordio.py

* Update recordio.py

* Update recordio.py

fixing text

* Update recordio.py

fixing text

* remove empty line

* Improve style (#6445)

* Correction (#6444)

* CSVIter example correction

* fix

* Update documentation for MXNetDataIter in io.py (#6000) (#6113)

* Update documentation for MXNetDataIter in io.py (#6000)

* [DOC] Respond to feedback (#6113)

* Fix minor issues with api pages. (#6410)

1. In the notes section for ndarray, references did not seem clear enough to be referring to mxnet.ndarray or numpy.ndarray. Added the package names as prefixes to make it more obvious.
2. "share the same C++ operator source codes" => "share the same code". Since we don't really need to throw in more details than required.
3. Other relatively  minor language changes which will be obvious from the diff.

Note that I'm relatively not sure about the need for 1/ since it makes things more verbose. Let me know if it unnecessary and I'll remove it.

* fixing the early stop for maximize = T  (#5915)

close #4587

* Update documentation for mxnet.ndarray.GridGenerator. (#6430)

* Update documentation for mxnet.ndarray.GridGenerator.

Thanks @Lyken17 for apache#6147

* Fix lint error.

* Minor fix.

* Remove the example.

* Update documentation for deconvolution operation. (#6184)

* Update documentation for deconvolution operation.

* Add examples.

* Remove the example.

* skip lines that have %matplotlib (#6451)

* Fixing some more broken links before v0.10 release (#6449)

* close #4838 (#6452)

* Fix linear regression (#6432)

* Fix Linear Regression Tutorial

* Small fix

* Pre-trained model tutorial fixes. (#6453)

Before the change on running the tutorial for the first time: "UserWarning: Data provided by label_shapes don't match names specified by label_names ([] vs. ['softmax_label'])". It also showed probability of >>1 due to incorrect usage of np.argsort().

* Nightly test tutorial (#6447)

* Add tutorial test

* Fix pylint

* Small fix

* fix member variable name: make them end with underline (#6438)

* [R] captcha example (#6443)

* skip lines that have %matplotlib (#6459)

* Fix cudnn_deconv not guarding no_bias (#6456)

* Fixing up issues in install guide (#6463)

* Fixing copy code functionality for bash command (#6465)

* Residual unroll (#6397)

* residual unroll

* unroll for residual cell

* merge_outputs fix

* Linear regression Tutorial link (#6468)

* Fixing a link in the linear regression tutorial.

The link was initally going to mxnet-test.readthedocs.io. Changed it to mxnet.io/api.

* More appropriate language.

* bump up version number for release (#6462)

* bump up version number for release

* update version for scala/R/backend

* [R][DOC] update R installation guide (#6457)

* Use sphinx==1.3.5 in Dockerfile.doc (#6470)

changed PR name

* Add 0.10 release info to README.md and NEWS.md (#6471)

@nswamy wants to merge it immediately, so i'm going to do it now. I also changed the PR title.

* fix batchNorm cpp example (#6454)

* Update im2rec.py (#6473)

Updated Line 107 of 'im2rec.py'. Read an image as binary.

* Change Interface of  NDArray & TBlob for DLPack Compatible (#6345)

* Change Interface of NDArray & TBlob for DLPack Compatible

Fix for cudnn operator

Fix cpp tests

* Update nnvm

* Fix for MKL mem

* Fix for windows macro

* Bump up version number to 0.10.1

* Update NDArray Save&Load

* trigger update

* Add test for legacy data load

* Use LegacyTShapeLoad

* trigger update

* Update tensor_blob.h

* change 'channel_axis' parameter to 'axis'

* Change DEFAULT_CHANNEL_AXIS to DEFAULT_AXIS

* wait for dlpack PR to go through

* Trigger build
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment