Skip to content

Conversation

flakrimjusufi
Copy link
Contributor

Add content-type to GCS sink

JIRA Ticket: https://cdap.atlassian.net/browse/PLUGIN-498

}

if (contentType == null) {
contentType = DEFAULT_CONTENT_TYPE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't modify field in the validate method. Instead, introduce "getter" method do the defaulting logic in there. In the validate method, call the getter method for validation if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the updated PR.

private static final String RECORDS_UPDATED_METRIC = "records.updated";
public static final String AVRO_NAMED_OUTPUT = "avro.mo.config.namedOutput";
public static final String COMMON_NAMED_OUTPUT = "mapreduce.output.basename";
public static final String CONTENT_TYPE = "Content-Type";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefix the FS configuration key. Also, use "." notation. E.g. "io.cdap.gcs.batch.sink.content.type".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was changed to public static final String CONTENT_TYPE = "io.cdap.gcs.batch.sink.content.type"; in the updated PR.

@CuriousVini
Copy link
Member

@flakrimjusufi If multiple GCS sinks are being used in a single pipeline with different content type. What is the behavior? Any conflicts for content type?

Copy link
Member

@CuriousVini CuriousVini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure changes have been tested for cases:

  1. Macros
  2. Backwards compatibility
  3. Also please add unit tests to these and modify integration tests to support this fix

@CuriousVini
Copy link
Member

/gcbrun

@flakrimjusufi
Copy link
Contributor Author

Please make sure changes have been tested for cases:

  1. Macros
  2. Backwards compatibility
  3. Also please add unit tests to these and modify integration tests to support this fix
  1. We tested with Macro and everything was okay. The pipeline was able to run successfully.
  2. We tested the pipelines with older versions of plugin and the pipelines were able to run successfully. We also did some further tests in CDAP 6.1.4 and CDAP 6.2.3 and everything worked fine.
  3. Please check the updated PR.

@flakrimjusufi
Copy link
Contributor Author

@flakrimjusufi If multiple GCS sinks are being used in a single pipeline with different content type. What is the behavior? Any conflicts for content type?

We did some tests with multiple GCS sink with different content types and the behavior was as expected. There were no conflicts for content type. The pipelines were able to run successfully.

@CuriousVini
Copy link
Member

CuriousVini commented Dec 28, 2020

Please make sure changes have been tested for cases:

  1. Macros
  2. Backwards compatibility
  3. Also please add unit tests to these and modify integration tests to support this fix
  1. We tested with Macro and everything was okay. The pipeline was able to run successfully.
  2. We tested the pipelines with older versions of plugin and the pipelines were able to run successfully. We also did some further tests in CDAP 6.1.4 and CDAP 6.2.3 and everything worked fine.
  3. Please check the updated PR.
  1. Could you please add a link to updated integration test PR here?

public String getContentType() {
if (!Strings.isNullOrEmpty(contentType)) {
if (contentType.equals(CONTENT_TYPE_OTHER)) {
if (Strings.isNullOrEmpty(customContentType)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means if customContentType is a macro, we use default content type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same in here as well, if the value of customContentType is a macro, the value that was specified by the user is being used.

@bajram-adapt
Copy link

Integration tests: cdapio/cdap-integration-tests#1083

@flakrimjusufi
Copy link
Contributor Author

Please make sure changes have been tested for cases:

  1. Macros
  2. Backwards compatibility
  3. Also please add unit tests to these and modify integration tests to support this fix
  1. We tested with Macro and everything was okay. The pipeline was able to run successfully.
  2. We tested the pipelines with older versions of plugin and the pipelines were able to run successfully. We also did some further tests in CDAP 6.1.4 and CDAP 6.2.3 and everything worked fine.
  3. Please check the updated PR.
  1. Could you please add a link to updated integration test PR here?

}
}

if (containsMacro(NAME_CONTENT_TYPE)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for this check, by default if its a macro or not provided at configure time, it will be null

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Updated.

failureCollector.addFailure(String.format(
"Valid content types for json are %s, %s", CONTENT_TYPE_APPLICATION_JSON,
CONTENT_TYPE_TEXT_PLAIN),
null
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move remaining to above line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Updated.

switch (format) {
case FORMAT_AVRO:
if (!contentType.equalsIgnoreCase(CONTENT_TYPE_APPLICATION_AVRO)) {
failureCollector.addFailure(String.format("Valid content type for avro is %s",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add period to the end of all the validation error messages

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Updated.

Copy link
Member

@CuriousVini CuriousVini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few comments. Please squash the commits after fixing.

Also please file a jira for GCS Multi sink plugin

}

@Nullable
@Description("Gets the value of content type")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be a javadoc not annotation. The javadoc should include mapping of format -> content type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR updated.

@flakrimjusufi
Copy link
Contributor Author

a few comments. Please squash the commits after fixing.

Also please file a jira for GCS Multi sink plugin

The PR is updated according to the latest review. Commits are also squashed.

Here you have the JIRA tikcket for GCS Multi Sink plugin:
https://cdap.atlassian.net/browse/PLUGIN-553

@CuriousVini CuriousVini merged commit f9ec4a1 into data-integrations:develop Jan 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants