hadoop archive
The hadoop archive command creates a Hadoop archive, a file that contains other files. A Hadoop archive always has a *.har extension.

hadoop classpath
The hadoop classpath command prints the class path needed to access the Hadoop JAR and the required libraries.

hadoop daemonlog
The hadoop daemonlog command gets and sets the log level for each daemon.

hadoop distcp
The hadoop distcp command is a tool used for large inter- and intra-cluster copying.

hadoop fs
The hadoop fs command runs a generic file system user client that interacts with the MapR file system.

hadoop jar
The hadoop jar command runs a program contained in a JAR file. Users can bundle their MapReduce code in a JAR file and execute it using this command.

hadoop job
The hadoop job command enables you to manage MapReduce jobs.

hadoop mfs
The hadoop mfs command displays directory information...
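For example, typical invocations look like the following (the archive name, paths, and cluster URIs are illustrative only):

# Archive the directory /user/data into files.har under /user/archives
hadoop archive -archiveName files.har -p /user/data /user/archives

# List a directory with the generic file system client
hadoop fs -ls /user/archives

# Copy a directory between clusters with distcp (source and destination URIs are examples)
hadoop distcp hdfs://source-cluster/user/data hdfs://target-cluster/user/data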
The space allocated to a database or user can be checked with the following query (replace [databasename] with the actual name):

SELECT DatabaseNameI, PermSpace, SpoolSpace, TempSpace, CreateTimeStamp
FROM DBC.DBase
WHERE DatabaseNameI = '[databasename]';

Permanent Space: Permanent space is the maximum amount of space allocated to the user/database to hold data rows. Perm space is used to create database objects (permanent tables, indexes, etc.) and to hold their data. The amount of permanent space is divided among the number of AMPs. Whenever usage on an AMP exceeds the space allocated to that AMP, a 'No more room in database' error message is generated.

Spool Space: Spool space is the unused permanent space that the system uses to hold the intermediate results of an SQL query. Users without spool space cannot execute any query. Spool data is active only for the current session. Spool space is also divided among the number of AMPs; whenever the per-AMP limit is exceeded, the user gets a spool space error.

Temporary Space: Temporary space is the unused permanent space which...
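Because these limits apply per AMP, it is often more useful to look at current versus maximum usage. A query along the following lines can do that; it assumes access to the standard DBC.DiskSpaceV view, which reports space per AMP (Vproc), so the figures are summed here:

-- Current vs. maximum space for one database, summed across all AMPs (Vprocs)
SELECT DatabaseName,
       SUM(MaxPerm)      AS MaxPerm,
       SUM(CurrentPerm)  AS CurrentPerm,
       SUM(MaxSpool)     AS MaxSpool,
       SUM(CurrentSpool) AS CurrentSpool,
       SUM(MaxTemp)      AS MaxTemp,
       SUM(CurrentTemp)  AS CurrentTemp
FROM DBC.DiskSpaceV
WHERE DatabaseName = '[databasename]'
GROUP BY DatabaseName;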
What is the difference between SET and MULTISET tables?

SET tables – SET tables do not allow duplicate rows. If the table type is not specified in the DDL, Teradata creates a SET table by default. A SET table forces Teradata to check for duplicate rows every time a new row is inserted or an existing row is updated. This check is a resource overhead when inserting a massive number of rows.

MULTISET tables – MULTISET tables allow duplicate rows in the table. Because a SET table incurs the additional overhead of checking for duplicate records, a few guidelines help avoid it (a DDL sketch follows the list):

· If you use a GROUP BY or QUALIFY clause on the source table, it is highly recommended to define the target table as MULTISET, since GROUP BY and QUALIFY already remove the duplicate records from the source.

· ...
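As a minimal illustration (the table and column names below are hypothetical), the table type is declared directly in the CREATE TABLE statement:

-- SET table: Teradata checks every inserted or updated row against existing rows (duplicate-row check)
CREATE SET TABLE employee_set
(
    emp_id   INTEGER,
    emp_name VARCHAR(50)
)
PRIMARY INDEX (emp_id);

-- MULTISET table: duplicate rows are allowed, so no duplicate-row check is performed
CREATE MULTISET TABLE employee_stg
(
    emp_id   INTEGER,
    emp_name VARCHAR(50)
)
PRIMARY INDEX (emp_id);

Defining a staging or target table as MULTISET in this way avoids the per-row duplicate check during large bulk inserts.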