Jump to content

User-defined function: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
No edit summary
Tags: Mobile edit Mobile web edit
m Reverted edits by 2409:4055:295:3DC9:0:0:172:48A0 (talk) to last version by Frap
 
(39 intermediate revisions by 31 users not shown)
Line 1: Line 1:
A '''user-defined function''' ('''UDF''') is a [[function (programming)|function]] provided by the user of a programnt.
A '''user-defined function''' ('''UDF''') is a [[function (programming)|function]] provided by the user of a program or environment, in a context where the usual assumption is that functions are built into the program or environment. UDFs are usually written for the requirement of its creator.


==BASIC language==
==BASIC language==
In some old implementations of the [[BASIC]] programming language, user-defined functions are defined using the "DEF FN" syntax. More modern dialects of BASIC are influenced by the [[structured programming]] paradigm, where most or all of the code is written as user-defined functions or procedures, and the concept becomes practically redundant.
In some old implementations of the [[BASIC]] programming language, user-defined functions are defined using the "DEF FN" syntax. More modern dialects of BASIC are influenced by the [[structured programming]] paradigm, where most or all of the code is written as user-defined functions or procedures, and the concept becomes practically redundant.

==COBOL language==
In the [[COBOL]] programming language, a user-defined function is an entity that is defined by the user by specifying a FUNCTION-ID paragraph. A user-defined function must return a value by specifying the RETURNING phrase of the procedure division header and they are invoked using the function-identifier syntax. See the ISO/IEC 1989:2014 Programming Language COBOL standard for details.

As of May 2022, the IBM Enterprise COBOL for z/OS 6.4 ([[IBM COBOL]]) compiler contains support for user-defined functions.


==Databases==
==Databases==
In SQL databases, a UDF provides a mechanism for extending the functionality of the [[database server]] by adding a function that can be evaluated in SQL statements. The [[SQL:2003|SQL]] standard distinguishes between [[Scalar (computing)|scalar]] and table functions. A scalar function returns only a single value (or [[null (SQL)|NULL]]), whereas a table function returns a (relational) table comprising zero or more rows, each row with one or more columns.
In [[relational database management system]]s, a user-defined function provides a mechanism for extending the functionality of the [[database server]] by adding a function, that can be evaluated in standard query language (usually [[SQL]]) statements. The [[ISO/IEC 9075|SQL standard]] distinguishes between [[Scalar (computing)|scalar]] and table functions. A scalar function returns only a single value (or [[null (SQL)|NULL]]), whereas a table function returns a (relational) table comprising zero or more rows, each row with one or more columns.


User-defined functions in SQL are declared using the <code>CREATE FUNCTION</code> statement. For example, a function that converts Celsius to Fahrenheit might be declared like this:
User-defined functions in SQL are declared using the <code>CREATE FUNCTION</code> statement. For example, a user-defined function that converts Celsius to Fahrenheit (a temperature scale used in USA) might be declared like this:
<source lang="sql">
<syntaxhighlight lang="sql">
CREATE FUNCTION dbo.CtoF(Celsius FLOAT)
CREATE FUNCTION dbo.CtoF(Celsius FLOAT)
RETURNS FLOAT
RETURNS FLOAT
RETURN (Celsius * 1.8) + 32
RETURN (Celsius * 1.8) + 32
</syntaxhighlight>
</source>


Once created, a user-defined function may be used in [[expression (programming)|expressions]] in SQL statements. For example, it can be invoked where most other intrinsic functions are allowed. This also includes [[Select (SQL)|SELECT statements]], where the function can be used against data stored in tables in the database. Conceptually, the function is evaluated once per row in such usage. For example, assume a table named ELEMENTS, with a row for each known chemical element. The table has a column named BoilingPoint for the boiling point of that element, in Celsius. The query
Once created, a user-defined function may be used in [[expression (programming)|expressions]] in SQL statements. For example, it can be invoked where most other intrinsic functions are allowed. This also includes [[Select (SQL)|SELECT statements]], where the function can be used against data stored in tables in the database. Conceptually, the function is evaluated once per row in such usage. For example, assume a table named {{Code|Elements}}, with a row for each known chemical element. The table has a column named BoilingPoint for the boiling point of that element, in Celsius. The query


<source lang="sql">
<syntaxhighlight lang="sql">
SELECT Name, CtoF(BoilingPoint)
SELECT Name, CtoF(BoilingPoint)
FROM Elements
FROM Elements
</syntaxhighlight>
</source>


would retrieve the name and the boiling point from each row. It invokes the CtoF user-defined function as declared above in order to convert the value in the column to a value in Fahrenheit.
would retrieve the name and the boiling point from each row. It invokes the <code>CtoF</code> user-defined function as declared above in order to convert the value in the column to a value in Fahrenheit.


Each user-defined function carries certain properties or characteristics. The SQL standard defines the following properties:
Each user-defined function carries certain properties or characteristics. The SQL standard defines the following properties:
Line 28: Line 33:
*Specific name - a name for the function that is unique within the database. Note that the function name does not have to be unique, considering [[overloaded function]]s. Some SQL implementations require that function names are unique within a database, and overloaded functions are not allowed.
*Specific name - a name for the function that is unique within the database. Note that the function name does not have to be unique, considering [[overloaded function]]s. Some SQL implementations require that function names are unique within a database, and overloaded functions are not allowed.
* Determinism - specifies whether the function is deterministic or not. The determinism characteristic has an influence on the [[query optimizer]] when compiling a SQL statement.
* Determinism - specifies whether the function is deterministic or not. The determinism characteristic has an influence on the [[query optimizer]] when compiling a SQL statement.
*SQL-data access - tells the database management system whether the function contains no SQL statements (NO SQL), contains SQL statements but does not access any tables or views (CONTAINS SQL), reads data from tables or views (READS SQL DATA), or actually modifies data in the database (MODIFIES SQL DATA).
*SQL-data access - tells the database management system whether the function contains no SQL statements (NO SQL), contains SQL statements but does not access any tables or [[view (SQL)|view]]s (CONTAINS SQL), reads data from tables or views (READS SQL DATA), or actually modifies data in the database (MODIFIES SQL DATA).


User-defined functions should not be confused with [[stored procedure]]s. Stored procedures allow the user to group a set of SQL commands. A procedure can accept parameters and execute its SQL statements depending on those parameters. A procedure is not an expression and, thus, cannot be used like user-defined functions.
User-defined functions should not be confused with [[stored procedure]]s. Stored procedures allow the user to group a set of SQL commands. A procedure can accept parameters and execute its SQL statements depending on those parameters. A procedure is not an expression and, thus, cannot be used like user-defined functions.
Line 35: Line 40:


===SQL Server 2000===
===SQL Server 2000===
There are three types of UDF in [[Microsoft SQL Server]] 2000: [[scalar function]]s, inline table-valued functions, and multistatement table-valued functions.
{{Cleanup|date=March 2009}}
There are three types of UDF in [[Microsoft SQL Server]] 2000:
*Scalar functions.
*Inline table-valued functions.
*Multistatementtable-valued functions.


Scalar functions return a single data value (not a table) with RETURNS clause. Scalar functions can use all scalar data types, with exception of timestamp and user-defined data types.
Scalar functions return a single data value (not a table) with RETURNS clause. Scalar functions can use all scalar data types, with exception of timestamp and user-defined data types.
Inline table-valued functions return the result set of a single SELECT statement.
Inline table-valued functions return the [[result set]] of a single SELECT statement.
Multistatement table-valued functions return a table, which was built with many TRANSACT-SQL statements.
Multistatement table-valued functions return a table, which was built with many TRANSACT-SQL statements.


User-defined functions can be invoked from a query like built‑in functions such as OBJECT_ID, LEN, DATEDIFF, or can be executed through an EXECUTE statement like stored procedures.
User-defined functions can be invoked from a query like built‑in functions such as OBJECT_ID, LEN, DATEDIFF, or can be executed through an EXECUTE statement like stored procedures.


Performance Notes:
Performance Notes:
1. On Microsoft SQL Server 2000 a table-valued function which "wraps" a View may be much faster than the View itself. The following MyFunction is an example of a "function-wrapper" which runs faster than the underlying view MyView:
1. On Microsoft SQL Server 2000 a table-valued function which "wraps" a View may be much faster than the View itself. The following MyFunction is an example of a "function-wrapper" which runs faster than the underlying view MyView:


<source lang="tsql">
<syntaxhighlight lang="tsql">
CREATE FUNCTION MyFunction()
CREATE FUNCTION MyFunction()
RETURNS @Tbl TABLE
RETURNS @Tbl TABLE
Line 80: Line 81:
RETURN
RETURN
END
END
</syntaxhighlight>
</source>


2. On Microsoft SQL Server 2005 the result of the same code execution is the opposite: view is executed faster than the "function-wrapper".
2. On Microsoft SQL Server 2005 the result of the same code execution is the opposite: view is executed faster than the "function-wrapper".
Line 91: Line 92:
Errors in UDF cause UDF to abort which, in turn, aborts the statement that invoked the UDF.
Errors in UDF cause UDF to abort which, in turn, aborts the statement that invoked the UDF.


<source lang="tsql">
<syntaxhighlight lang="tsql">
CREATE FUNCTION CubicVolume
CREATE FUNCTION CubicVolume
-- Input dimensions in centimeters
-- Input dimensions in centimeters
Line 104: Line 105:
RETURN(@CubeLength * @CubeWidth * @CubeHeight)
RETURN(@CubeLength * @CubeWidth * @CubeHeight)
END
END
</syntaxhighlight>
</source>


Data type supported in Microsoft SQL Server 2000
Data type supported in Microsoft SQL Server 2000
Line 116: Line 117:
DROP FUNCTION to remove UDF
DROP FUNCTION to remove UDF


===Hive===
===Apache Hive===
[[Apache Hive|Hive]] defines, in addition to the regular user defined functions (UDF), also user defined aggregate functions (UDAF) and table-generating functions (UDTF).<ref>
[[Apache Hive]] defines, in addition to the regular user-defined functions (UDF), also user-defined aggregate functions (UDAF) and table-generating functions (UDTF).<ref>
{{cite web
{{cite web
|first= |last=
|first= |last=
|url=https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
|url=https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
|title=LanguageManual UDF - Apache Hive - Apache Software Foundation
|title=LanguageManual UDF - Apache Hive - Apache Software Foundation
|work=
|website=
|publisher=
|publishe=
|date=26 June 2015}}
|date=26 June 2015}}
</ref> Hive enables developers to create their own custom functions with Java.<ref>
</ref> Hive enables developers to create their own custom functions with Java.<ref>
Line 130: Line 131:
|url=https://cwiki.apache.org/confluence/display/Hive/HivePlugins
|url=https://cwiki.apache.org/confluence/display/Hive/HivePlugins
|title=HivePlugins - Apache Hive - Apache Software Foundation
|title=HivePlugins - Apache Hive - Apache Software Foundation
|work=
|website=
|publisher=
|publisher=
|date=26 June 2015}}
|date=26 June 2015}}
</ref>
</ref>

===Apache Doris===
Apache Doris, an open-source real-time analytical database, allows external users to contribute their own UDFs written in C++ to it.<ref>{{cite web |title=Apache Doris UDF |url=https://doris.apache.org/docs/dev/ecosystem/udf/contribute-udf?_highlight=udf |access-date=8 April 2023}}</ref>


==References==
==References==
Line 140: Line 144:
==External links==
==External links==
*[http://msdn2.microsoft.com/en-us/library/aa258261(SQL.80).aspx Microsoft SQL Server reference for CREATE FUNCTION]
*[http://msdn2.microsoft.com/en-us/library/aa258261(SQL.80).aspx Microsoft SQL Server reference for CREATE FUNCTION]
*Visual Basic 2010 Professinal/Express
*[http://dev.mysql.com/doc/refman/5.5/en/adding-functions.html MySQL manual section on UDFs]
*[http://dev.mysql.com/doc/refman/5.5/en/adding-functions.html MySQL manual section on UDFs]
*[http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/index.jsp?topic=%2Fcom.ibm.db2.luw.sql.ref.doc%2Fdoc%2Fr0000917.html DB2 CREATE FUNCTION statement]
*[http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/index.jsp?topic=%2Fcom.ibm.db2.luw.sql.ref.doc%2Fdoc%2Fr0000917.html DB2 CREATE FUNCTION statement]

Latest revision as of 10:58, 14 December 2023

A user-defined function (UDF) is a function provided by the user of a program or environment, in a context where the usual assumption is that functions are built into the program or environment. UDFs are usually written for the requirement of its creator.

BASIC language

[edit]

In some old implementations of the BASIC programming language, user-defined functions are defined using the "DEF FN" syntax. More modern dialects of BASIC are influenced by the structured programming paradigm, where most or all of the code is written as user-defined functions or procedures, and the concept becomes practically redundant.

COBOL language

[edit]

In the COBOL programming language, a user-defined function is an entity that is defined by the user by specifying a FUNCTION-ID paragraph. A user-defined function must return a value by specifying the RETURNING phrase of the procedure division header and they are invoked using the function-identifier syntax. See the ISO/IEC 1989:2014 Programming Language COBOL standard for details.

As of May 2022, the IBM Enterprise COBOL for z/OS 6.4 (IBM COBOL) compiler contains support for user-defined functions.

Databases

[edit]

In relational database management systems, a user-defined function provides a mechanism for extending the functionality of the database server by adding a function, that can be evaluated in standard query language (usually SQL) statements. The SQL standard distinguishes between scalar and table functions. A scalar function returns only a single value (or NULL), whereas a table function returns a (relational) table comprising zero or more rows, each row with one or more columns.

User-defined functions in SQL are declared using the CREATE FUNCTION statement. For example, a user-defined function that converts Celsius to Fahrenheit (a temperature scale used in USA) might be declared like this:

CREATE FUNCTION dbo.CtoF(Celsius FLOAT)
  RETURNS FLOAT
  RETURN (Celsius * 1.8) + 32

Once created, a user-defined function may be used in expressions in SQL statements. For example, it can be invoked where most other intrinsic functions are allowed. This also includes SELECT statements, where the function can be used against data stored in tables in the database. Conceptually, the function is evaluated once per row in such usage. For example, assume a table named Elements, with a row for each known chemical element. The table has a column named BoilingPoint for the boiling point of that element, in Celsius. The query

SELECT Name, CtoF(BoilingPoint)
  FROM Elements

would retrieve the name and the boiling point from each row. It invokes the CtoF user-defined function as declared above in order to convert the value in the column to a value in Fahrenheit.

Each user-defined function carries certain properties or characteristics. The SQL standard defines the following properties:

  • Language - defines the programming language in which the user-defined function is implemented; examples include SQL, C, C# and Java.
  • Parameter style - defines the conventions that are used to pass the function parameters and results between the implementation of the function and the database system (only applicable if language is not SQL).
  • Specific name - a name for the function that is unique within the database. Note that the function name does not have to be unique, considering overloaded functions. Some SQL implementations require that function names are unique within a database, and overloaded functions are not allowed.
  • Determinism - specifies whether the function is deterministic or not. The determinism characteristic has an influence on the query optimizer when compiling a SQL statement.
  • SQL-data access - tells the database management system whether the function contains no SQL statements (NO SQL), contains SQL statements but does not access any tables or views (CONTAINS SQL), reads data from tables or views (READS SQL DATA), or actually modifies data in the database (MODIFIES SQL DATA).

User-defined functions should not be confused with stored procedures. Stored procedures allow the user to group a set of SQL commands. A procedure can accept parameters and execute its SQL statements depending on those parameters. A procedure is not an expression and, thus, cannot be used like user-defined functions.

Some database management systems allow the creation of user defined functions in languages other than SQL. Microsoft SQL Server, for example, allows the user to use .NET languages including C# for this purpose. DB2 and Oracle support user-defined functions written in C or Java programming languages.

SQL Server 2000

[edit]

There are three types of UDF in Microsoft SQL Server 2000: scalar functions, inline table-valued functions, and multistatement table-valued functions.

Scalar functions return a single data value (not a table) with RETURNS clause. Scalar functions can use all scalar data types, with exception of timestamp and user-defined data types. Inline table-valued functions return the result set of a single SELECT statement. Multistatement table-valued functions return a table, which was built with many TRANSACT-SQL statements.

User-defined functions can be invoked from a query like built‑in functions such as OBJECT_ID, LEN, DATEDIFF, or can be executed through an EXECUTE statement like stored procedures.

Performance Notes: 1. On Microsoft SQL Server 2000 a table-valued function which "wraps" a View may be much faster than the View itself. The following MyFunction is an example of a "function-wrapper" which runs faster than the underlying view MyView:

CREATE FUNCTION MyFunction()
    RETURNS @Tbl TABLE 
    (
        StudentID              VARCHAR(255),
        SAS_StudentInstancesID INT,
        Label                  VARCHAR(255),
        Value                  MONEY,
        CMN_PersonsID          INT
    )
AS
BEGIN
    INSERT @Tbl
    (
        StudentID,
        SAS_StudentInstancesID,
        Label,
        Value,
        CMN_PersonsID
    )
    SELECT 
        StudentID,
        SAS_StudentInstancesID,
        Label,
        Value,
        CMN_PersonsID
    FROM MyView -- where MyView selects (with joins) the same columns from large table(s)

    RETURN
END

2. On Microsoft SQL Server 2005 the result of the same code execution is the opposite: view is executed faster than the "function-wrapper".

User-defined functions are subroutines made of one or more Transact-SQL statements that can be used to encapsulate code for reuse. It takes zero or more arguments and evaluates a return value. Has both control-flow and DML statements in its body similar to stored procedures. Does not allow changes to any Global Session State, like modifications to database or external resource, such as a file or network. Does not support output parameter. DEFAULT keyword must be specified to pass the default value of parameter. Errors in UDF cause UDF to abort which, in turn, aborts the statement that invoked the UDF.

CREATE FUNCTION CubicVolume
-- Input dimensions in centimeters
(
    @CubeLength decimal(4,1), 
    @CubeWidth  decimal(4,1),
    @CubeHeight decimal(4,1)
)
    RETURNS decimal(12,3)
AS
BEGIN
  RETURN(@CubeLength * @CubeWidth * @CubeHeight)
END

Data type supported in Microsoft SQL Server 2000 Like a temporary table used to store results Mostly used to define temporary variable of type (table) and the return value of a UDF The scope is limited to function, stored procedure, or batch in which it is defined Assignment operation is not allowed between (Table) variables May be used in SELECT, INSERT, UPDATE, and DELETE CREATE FUNCTION to create UDF ALTER FUNCTION to change the characteristics of UDF DROP FUNCTION to remove UDF

Apache Hive

[edit]

Apache Hive defines, in addition to the regular user-defined functions (UDF), also user-defined aggregate functions (UDAF) and table-generating functions (UDTF).[1] Hive enables developers to create their own custom functions with Java.[2]

Apache Doris

[edit]

Apache Doris, an open-source real-time analytical database, allows external users to contribute their own UDFs written in C++ to it.[3]

References

[edit]
  1. ^ "LanguageManual UDF - Apache Hive - Apache Software Foundation". 26 June 2015.
  2. ^ "HivePlugins - Apache Hive - Apache Software Foundation". 26 June 2015.
  3. ^ "Apache Doris UDF". Retrieved 8 April 2023.
[edit]