Stop on Errors. This session property stops the session if a nonfatal error occurs in the reader, writer, or transformation threads. It is disabled by default.
Bad Files. This is a common term for anyone working with Informatica. A bad file is simply a flat file that stores all the records rejected by the target. The Integration Service appends rejected rows to the bad file on each load.
What are the error tables in Informatica, and how do we do error handling in Informatica?
- PMERR_DATA. Stores data and metadata about a transformation row error and its corresponding source row.
- PMERR_MSG. Stores metadata about an error and the error message.
- PMERR_SESS. Stores metadata about the session.
- PMERR_TRANS. Stores metadata, such as the name and datatype, of the source and transformation ports involved when a transformation error occurs.
Session Log Information: Whenever a session runs, the Integration Service logs information about the tasks it performs in a file called the session log. It records details such as the load summary and transformation statistics.
The main difference between normal and bulk load is that in normal load the target database writes to its transaction log, while in bulk load the database log is bypassed. That is why bulk load loads data faster, and also why the data cannot be recovered if anything goes wrong.
Use the following guidelines to optimize the performance of an Aggregator transformation:
- Group by simple columns.
- Use sorted input.
- Use incremental aggregation.
- Filter data before you aggregate it.
- Limit port connections.
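The benefit of sorted input can be sketched outside Informatica: when rows arrive already grouped by the group-by key, the aggregator can emit each group as soon as the key changes instead of caching every group at once. A minimal Python illustration (data and column names are invented):

```python
from itertools import groupby
from operator import itemgetter

# Rows already sorted by the group-by key, as the Sorted Input option requires.
rows = [
    ("east", 10), ("east", 5),
    ("west", 7), ("west", 3), ("west", 1),
]

# Streaming aggregation: only one group is held in memory at a time.
def aggregate_sorted(rows):
    for region, group in groupby(rows, key=itemgetter(0)):
        yield region, sum(amount for _, amount in group)

print(list(aggregate_sorted(rows)))  # [('east', 15), ('west', 11)]
```

With unsorted input, the aggregator would instead have to hold every group in its cache until the last row arrives.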
Source Bottlenecks
Performance bottlenecks can occur when the Integration Service reads from a source database. Slowness in reading data from the source delays filling the DTM buffer, so the transformation and writer threads wait for data, and the entire session runs slower. Use a Filter transformation as early as possible in the mapping: if unwanted data can be discarded early, throughput increases. Alternatively, filter in the Source Qualifier so the unwanted rows are never read at all.
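The filter-early advice can be sketched in plain Python: rows discarded before an expensive transformation never consume downstream work. The data and transformation here are invented for illustration:

```python
# Invented rows; half are inactive and should be discarded early.
rows = [{"id": i, "active": i % 2 == 0} for i in range(10)]

transform_calls = 0

def expensive_transform(row):
    # Stand-in for a costly downstream transformation.
    global transform_calls
    transform_calls += 1
    return {**row, "id_squared": row["id"] ** 2}

# Filter first, then transform: only 5 of the 10 rows do the expensive work.
result = [expensive_transform(r) for r in rows if r["active"]]
print(transform_calls)  # 5
```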
What is Persistent Cache? Lookups are cached by default in Informatica: during the session run, the Integration Service brings the entire data of the lookup table from the database server into the Informatica server as part of lookup cache building. By default this cache is discarded when the session completes. A persistent cache is instead saved to disk, so subsequent sessions can reuse the cache files rather than rebuild the cache from the database.
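The idea behind a persistent cache can be sketched as "build once, save to disk, reuse next run". This is a hedged illustration of the concept only, not Informatica's actual cache file format; the file name and lookup data are invented:

```python
import os
import pickle
import tempfile

# Invented location standing in for a cache directory.
cache_file = os.path.join(tempfile.gettempdir(), "lookup_cache_demo.pkl")

def build_cache():
    # Stand-in for reading the whole lookup table from the database.
    return {1: "bronze", 2: "silver", 3: "gold"}

def get_cache():
    if os.path.exists(cache_file):      # persistent: reuse the saved cache
        with open(cache_file, "rb") as f:
            return pickle.load(f)
    cache = build_cache()               # first run: build and save to disk
    with open(cache_file, "wb") as f:
        pickle.dump(cache, f)
    return cache

cache = get_cache()
print(cache[2])  # silver
```

The second call to `get_cache()` skips `build_cache()` entirely, which is the performance win a persistent lookup cache aims for.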
Pushdown optimization is a concept using which you can push the transformation logic to the source or target database side. The Integration Service translates that logic into SQL and sends it to the database; session performance is enhanced because processing the data at the database level is faster than processing it in Informatica.
The pushdown optimization technique in Informatica pushes part or all of the transformation logic to the source or target database. To preview the SQL statements and mapping logic that the Integration Service pushes to the source or target database, use the Pushdown Optimization Viewer.
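The principle can be sketched with any database: instead of pulling every row into the ETL tier and aggregating there, the whole filter-and-aggregate runs inside the database engine as SQL. A minimal sketch using an in-memory SQLite table (table and data are invented):

```python
import sqlite3

# Invented source table standing in for a real source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10), ("east", 5), ("west", 7)])

# "Pushed down": the aggregation runs inside the database engine,
# and only the small result set crosses the wire.
pushed = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(pushed)  # [('east', 15), ('west', 7)]
```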
Busy Time: the percentage of the run time a thread is busy, computed as (run time - idle time) / run time x 100. Thread Work Time: the percentage of time taken to process each transformation in a thread.
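The busy-time formula above is simple enough to check by hand; as a small helper (the sample numbers are invented):

```python
def busy_percent(run_time, idle_time):
    # Busy time = (run time - idle time) / run time x 100
    return (run_time - idle_time) / run_time * 100

# A thread that ran for 200 s and sat idle for 50 s was 75% busy.
print(busy_percent(200, 50))  # 75.0
```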
To partition a session: set a partition point, set the number of partitions, then choose the partition type. Pass-Through (default): all rows stay in their partition with no data redistribution; adding a pass-through partition point creates an additional pipeline stage, which can improve performance. Round-Robin: rows are distributed evenly across the partitions.
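Round-robin distribution can be sketched as dealing rows to the configured number of partitions in turn, so each partition receives roughly the same row count (the data here is invented):

```python
def round_robin(rows, num_partitions):
    # Deal rows to partitions in turn, like dealing cards.
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions

print(round_robin(list(range(7)), 3))  # [[0, 3, 6], [1, 4], [2, 5]]
```

Note that round-robin balances the load but gives no grouping guarantee; rows with the same key can land in different partitions.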
DTM Buffer Size: specifies the amount of buffer memory used when the DTM processes a session. Default Buffer Block Size: specifies the amount of buffer memory used to move a block of data from the source to the target.
Data Transformation Manager (DTM) process. The PowerCenter Integration Service process starts the Data Transformation Manager process to run a session. The DTM process is also known as the pmdtm process. The DTM process forms partition groups and distributes them to worker DTM processes running on nodes in the grid.
In the case of a flat file, a sorted Joiner is generally more effective than a Lookup, because it uses join conditions and caches fewer rows. In the case of a database, a Lookup can be effective if the database returns sorted data quickly and the data volume is small, because the Lookup can build the whole cache in memory.
The DTM buffer size value depends on the amount of memory available on the server hosting the PowerCenter Integration Service. However, an initial value might be 128 MB or 256 MB, entered as 128000000 or 256000000.
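The two settings above relate by simple arithmetic: the DTM buffer size divided by the buffer block size gives the number of blocks available to the session. The rule of thumb that a session wants about two blocks per source and target connection is an assumption used here for illustration, not official tuning guidance:

```python
# Values from the discussion above: 128 MB entered as 128000000,
# and an assumed default block size of 64 KB.
dtm_buffer_size = 128_000_000
block_size = 64_000

blocks_available = dtm_buffer_size // block_size
print(blocks_available)  # 2000

# Assumed rule of thumb: ~2 blocks per source/target connection.
sources_and_targets = 4
blocks_needed = sources_and_targets * 2
print(blocks_needed <= blocks_available)  # True
```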
A cache is a memory area where the Informatica server holds data to perform calculations during the session run. It creates cache files in $PMCacheDir; as soon as the session finishes, the server deletes the cache files from the directory.
We can configure sessions for pushdown optimization with any of these databases: Oracle, IBM DB2, Teradata, Microsoft SQL Server, Sybase ASE, or databases that use ODBC drivers. When native drivers are used, the Integration Service generates SQL statements using native database SQL.
Ans: The type of Command task that allows shell commands to run anywhere during the workflow is known as a standalone Command task.
Session partitioning means splitting the ETL data load into multiple parallel pipeline threads. It is helpful on an RDBMS such as Oracle, but less effective on Teradata or Netezza (it conflicts with their auto-parallel architecture). Informatica supports several partition types.
Target-based commit: the Informatica Server commits rows based on the number of target rows and the key constraints on the target table. A commit interval is the interval at which the Informatica Server commits data to targets during a session.
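The commit-interval idea can be sketched with any transactional database: rows are written to the target and committed in batches of `commit_interval` rows rather than one commit per row. A minimal SQLite sketch (table, data, and interval are invented):

```python
import sqlite3

# Invented target table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER)")

commit_interval = 100
commits = 0
for i in range(250):
    conn.execute("INSERT INTO target VALUES (?)", (i,))
    if (i + 1) % commit_interval == 0:   # commit every 100 rows
        conn.commit()
        commits += 1
conn.commit()                            # final commit for the last 50 rows
commits += 1
print(commits)  # 3
```

Fewer, larger commits reduce per-commit overhead, at the cost of more rows to roll back or recover if the load fails mid-batch.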