I'm trying to create a fairly simple custom transform component (because I've read that's the easiest one to create) which will take one column from a flat file source and based on the first row create the output columns.
I'm actually trying to write a component that will solve the now well known problem with parsing CSV files in SSIS. I have a lot of source files and all have many columns so a component that can read in the first line from the CSV file and create the output columns automatically will save me lots of time when migrating the old DTS packages.
I have the basic component set up but I'm stuck when trying to override the OnInputPathAttached method because I don't know how to use the inputID to get the first line from the input (the buffer).
Are there any good examples for creating output columns dynamically based on the input buffer?
Should I just give up on on the transform and create a custom source component instead?
Since there aren't any rows in the buffer until runtime, I don't see how this will work. Packages can't change their metadata (inputs / outputs) at runtime. You could write a source that uses the connection manager at design time to read the first line from the file, and add the output columns, but that would be by directly reading the flat file, not by using a row from the buffer.|||You could try something like this-
IDTSInput90 input = ComponentMetaData.InputCollection[inputID];
This is a design-time action, in the same way as you would "normally" use the flat file source to load a CSV file, and let the designer UI figure out the columns. This will not allow you to change the file layout at run-time, and magically load any file you happen to find. You area aware of this distinction?
From a design pattern perspective, this is not the place to be selecting and generating columns. It would be more sensible to do this either in a UI or ReinitializeMetadata. Validate could detect the stupid state of no input columns selected, and/or it not matching the input, and call RMD.
A source may be cleaner, or even a package generator. I am not clear on what you are really trying to do, and what problem you need to solve.
|||Thanks,
I'm trying to solve the problem related to a CSV source missing columns for some rows, it's been brought up a few times here in the past but I haven't seen a generic solution that is suitable for multiple DTS packages that are dependent on multiple CSV files all with 50+ columns.
Here's a forum entry on it:
http://forums.microsoft.com/TechNet/ShowPost.aspx?PostID=2025483&SiteID=17
And Jamie T's explanation with more links:
http://blogs.conchango.com/jamiethomson/archive/2007/05/15/SSIS_3A00_--Flat-File-Connection-Manager-issues.aspx
I'll see if I can get the custom source component working today.
|||I was able to get the source component working based off an example from Professional SQL Server 2005 Integration Services
http://www.wrox.com/WileyCDA/WroxTitle/productCd-0764584359.html
(The site has a page for downloading the examples).
The example for creating a source component had a couple errors in it (probably from being based off a pre-release version of SSIS).
Here's some of the key code:
Code Snippet
public override void AcquireConnections(object transaction)
{
if (ComponentMetaData.RuntimeConnectionCollection["File To Read"].ConnectionManager != null)
{
ConnectionManager cm = Microsoft.SqlServer.Dts.Runtime.DtsConvert.ToConnectionManager(ComponentMetaData.RuntimeConnectionCollection["File To Read"].ConnectionManager);
if (cm.CreationName != "FLATFILE")
{
throw new Exception("The Connection Manager is not a FILE Connection Manager");
}
else
{
_fileExist = (Microsoft.SqlServer.Dts.Runtime.DTSFileConnectionUsageType)cm.Properties["FileUsageType"].GetValue(cm);
if (_fileExist != Microsoft.SqlServer.Dts.Runtime.DTSFileConnectionUsageType.FileExists)
{
throw new Exception("The type of FILE connection manager must be an Existing File");
}
else
{
_filename = ComponentMetaData.RuntimeConnectionCollection["File To Read"].ConnectionManager.AcquireConnection(transaction).ToString();
if (_filename == null || _filename.Length == 0)
{
throw new Exception("Nothing returned when grabbing the filename");
}
}
}
}
}
The original example checked "if (cm.CreationName != "FILE")" which should actually be "if (cm.CreationName != "FLATFILE")"
Code Snippet
private void CreateOutputAndMetaDataColumns(IDTSOutput90 output)
{
if (_filename != null || _filename.Length > 0)
{
TextReader tr = File.OpenText(_filename);
string columns = tr.ReadLine();
tr.Close();
_columnNames = columns.Split(",".ToCharArray());
foreach (string columnName in _columnNames)
{
IDTSOutputColumn90 outName = output.OutputColumnCollection.New();
outName.Name = columnName.Trim();
outName.Description = columnName.Trim();
outName.SetDataTypeProperties(DataType.DT_STR, 50, 0, 0, 1252);
//Create an external metadata column to go alongside with it
CreateExternalMetaDataColumn(output.ExternalMetadataColumnCollection, outName);
}
}
}
Just to get the sample working all columns are strings for the moment, for my needs this is all I needed anyways.
Code Snippet
private bool DoesEachOutputColumnHaveAMetaDataColumnAndDoDatatypesMatch(int outputID)
{
IDTSOutput90 output = ComponentMetaData.OutputCollection.GetObjectByID(outputID);
IDTSExternalMetadataColumn90 mdc;
bool rtnVal = true;
int cCount = 0;
foreach (IDTSOutputColumn90 col in output.OutputColumnCollection)
{
if (col.ExternalMetadataColumnID == 0)
{
rtnVal = false;
}
else
{
//mdc = output.ExternalMetadataColumnCollection[col.ExternalMetadataColumnID];
mdc = output.ExternalMetadataColumnCollection[cCount];
if (mdc.DataType != col.DataType || mdc.Length != col.Length || mdc.Precision != col.Precision
|| mdc.Scale != col.Scale || mdc.CodePage != col.CodePage)
{
rtnVal = false;
}
cCount++;
}
}
return rtnVal;
}
This was the other change I needed to make to the example, the collection index doesn't match the column ID.
I'll try to post the full source code online if I get some time so that hopefully it saves someone else the trouble.
No comments:
Post a Comment