Edit filter log

Details for log entry 34,971,321

01:20, 29 April 2023: Lindsey40186 triggered filter 550, performing the action "edit" on Cuneiform (programming language). Actions taken: Tag; Filter description: nowiki tags inserted into an article

Changes made in edit

  {{sxhl|lang=erlang|1=
  let r : <a : File, b : File> =
- <a = f(), b = g()>;
+ <nowiki><a = f(), b = g()></nowiki>;
  }}
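
The change recorded above can be checked mechanically. The sketch below only illustrates the idea behind the filter description ("nowiki tags inserted into an article"); it is not the actual rule of filter 550, and the names `count_nowiki` and `inserts_nowiki` are hypothetical helpers introduced for this example.

```python
import re

# Matches opening or self-closing tags such as <nowiki>, <nowiki/> or <nowiki />.
# Closing tags (</nowiki>) are not matched because they start with "</".
NOWIKI_TAG = re.compile(r"<nowiki\s*/?>", re.IGNORECASE)


def count_nowiki(wikitext: str) -> int:
    """Count opening or self-closing nowiki tags in a wikitext string."""
    return len(NOWIKI_TAG.findall(wikitext))


def inserts_nowiki(old_wikitext: str, new_wikitext: str) -> bool:
    """Return True when the edit increases the number of nowiki tags."""
    return count_nowiki(new_wikitext) > count_nowiki(old_wikitext)


# The removed and added lines from the change shown in this log entry.
old_line = "<a = f(), b = g()>;"
new_line = "<nowiki><a = f(), b = g()></nowiki>;"

print(inserts_nowiki(old_line, new_line))  # prints True
```

Comparing tag counts rather than searching only the new text keeps pages that already contained nowiki tags from matching on every later edit.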


Action parameters

Variable
Value
Edit count of the user (user_editcount)
4223
Name of the user account (user_name)
'Lindsey40186'
Age of the user account (user_age)
436253600
Groups (including implicit) the user is in (user_groups)
[ 0 => 'extendedconfirmed', 1 => '*', 2 => 'user', 3 => 'autoconfirmed' ]
Rights that the user has (user_rights)
[ 0 => 'extendedconfirmed', 1 => 'createaccount', 2 => 'read', 3 => 'edit', 4 => 'createtalk', 5 => 'writeapi', 6 => 'viewmywatchlist', 7 => 'editmywatchlist', 8 => 'viewmyprivateinfo', 9 => 'editmyprivateinfo', 10 => 'editmyoptions', 11 => 'abusefilter-log-detail', 12 => 'urlshortener-create-url', 13 => 'centralauth-merge', 14 => 'abusefilter-view', 15 => 'abusefilter-log', 16 => 'vipsscaler-test', 17 => 'collectionsaveasuserpage', 18 => 'reupload-own', 19 => 'move-rootuserpages', 20 => 'createpage', 21 => 'minoredit', 22 => 'editmyusercss', 23 => 'editmyuserjson', 24 => 'editmyuserjs', 25 => 'purge', 26 => 'sendemail', 27 => 'applychangetags', 28 => 'spamblacklistlog', 29 => 'mwoauthmanagemygrants', 30 => 'reupload', 31 => 'upload', 32 => 'move', 33 => 'autoconfirmed', 34 => 'editsemiprotected', 35 => 'skipcaptcha', 36 => 'ipinfo', 37 => 'ipinfo-view-basic', 38 => 'transcode-reset', 39 => 'transcode-status', 40 => 'createpagemainns', 41 => 'movestable', 42 => 'autoreview', 43 => 'enrollasmentor' ]
Whether the user is editing from mobile app (user_app)
false
Whether or not a user is editing through the mobile interface (user_mobile)
false
Page ID (page_id)
51797637
Page namespace (page_namespace)
0
Page title without namespace (page_title)
'Cuneiform (programming language)'
Full page title (page_prefixedtitle)
'Cuneiform (programming language)'
Edit protection level of the page (page_restrictions_edit)
[]
Page age in seconds (page_age)
207537335
Action (action)
'edit'
Edit summary/reason (summary)
'v2.05 - Fix errors for [[WP:WCW|CW project]] (HTML text style element <a>)'
Old content model (old_content_model)
'wikitext'
New content model (new_content_model)
'wikitext'
Old page wikitext, before the edit (old_wikitext)
'{{Short description|Open-source workflow language}} {{Infobox programming language | name = Cuneiform | logo = G18225.png | screenshot = Cf screenshot.jpg | caption = Screenshot of the Cuneiform editor and command line shell | paradigm = [[functional programming|functional]], [[Scientific workflow system|scientific workflow]] | designer = Jörgen Brandt | founder = | status = Active | latest release version = 3.0.4 | latest release date = {{release date|2018|11|19}} | latest preview version = | latest preview date = | typing = [[Static typing|static]], simple types | implementations = | dialects = | influenced_by = [[Swift (parallel scripting language)]] | influenced = | operating system = [[Linux]], [[Mac OS]] | programming language = [[Erlang (programming language)|Erlang]] | license = [[Apache License]] 2.0 | website = {{URL|https://cuneiform-lang.org/}} | file_ext = .cfl | year = 2013 }} '''Cuneiform''' is an [[open source software|open-source]] [[Scientific workflow system|workflow language]] for large-scale scientific data analysis.<ref>{{Cite web|url=https://github.com/joergen7/cuneiform|title = Joergen7/Cuneiform|website = [[GitHub]]|date = 14 October 2021}}</ref><ref>{{Cite journal | last1 = Brandt | first1 = Jörgen | last2 = Bux | first2 = Marc N. | last3 = Leser | first3 = Ulf | title = Cuneiform: A functional language for large scale scientific data analysis | journal = Proceedings of the Workshops of the EDBT/ICDT | volume = 1330 | pages = 17–26 | year = 2015 | url = http://ceur-ws.org/Vol-1330/paper-03.pdf }}</ref> It is a [[Type system#STATIC|statically typed]] [[Functional programming|functional programming language]] promoting [[parallel computing]]. It features a versatile [[foreign function interface]] allowing users to integrate software from many external programming languages. 
At the organizational level Cuneiform provides facilities like [[Conditional (computer programming)|conditional branching]] and [[Recursion|general recursion]] making it [[Turing completeness|Turing-complete]]. In this, Cuneiform is the attempt to close the gap between scientific workflow systems like [[Apache Taverna|Taverna]], [[KNIME]], or [[Galaxy (computational biology)|Galaxy]] and large-scale data analysis programming models like [[MapReduce]] or [[Pig (programming tool)|Pig Latin]] while offering the generality of a functional programming language. Cuneiform is implemented in distributed [[Erlang (programming language)|Erlang]]. If run in distributed mode it drives a [[POSIX]]-compliant distributed file system like [[Gluster]] or [[Ceph (software)#CephFS|Ceph]] (or a [[Filesystem in Userspace|FUSE]] integration of some other file system, e.g., [[Apache Hadoop#HDFS|HDFS]]). Alternatively, Cuneiform scripts can be executed on top of [[HTCondor]] or [[Apache Hadoop|Hadoop]].<ref>{{cite web|title=Scalable Multi-Language Data Analysis on Beam: The Cuneiform Experience by Jörgen Brandt|url=http://beta.erlangcentral.org/videos/scalable-multi-language-data-analysis-on-beam-the-cuneiform-experience-by-jorgen-brandt/#.WBLlE2hNzIU|website=Erlang Central|access-date=28 October 2016|archive-url=https://web.archive.org/web/20161002222350/http://beta.erlangcentral.org/videos/scalable-multi-language-data-analysis-on-beam-the-cuneiform-experience-by-jorgen-brandt/#.WBLlE2hNzIU|archive-date=2 October 2016|url-status=dead}}</ref><ref> {{Cite journal | last1 = Bux | first1 = Marc | last2 = Brandt | first2 = Jörgen | last3 = Lipka | first3 = Carsten | last4 = Hakimzadeh | first4 = Kamal | last5 = Dowling | first5 = Jim | last6 = Leser | first6 = Ulf | title = SAASFEE: scalable scientific workflow execution engine | journal = Proceedings of the VLDB Endowment | volume = 8 | number = 12 | pages = 1892–1895 | year = 2015 | url = http://www.vldb.org/pvldb/vol8/p1892-bux.pdf | doi = 
10.14778/2824032.2824094 }}</ref><ref> {{Cite journal | last1 = Bessani | first1 = Alysson | last2 = Brandt | first2 = Jörgen | last3 = Bux | first3 = Marc | last4 = Cogo | first4 = Vinicius | last5 = Dimitrova | first5 = Lora | last6 = Dowling | first6 = Jim | last7 = Gholami | first7 = Ali | last8 = Hakimzadeh | first8 = Kamal | last9 = Hummel | first9 = Michael | last10 = Ismail | first10 = Mahmoud | last11 = Laure | first11 = Erwin | last12 = Leser | first12 = Ulf | last13 = Litton | first13 = Jan-Eric | last14 = Martinez | first14 = Roxanna | last15 = Niazi | first15 = Salman | last16 = Reichel | first16 = Jane | last17 = Zimmermann | first17 = Karin | title = Biobankcloud: a platform for the secure storage, sharing, and processing of large biomedical data sets | journal = The First International Workshop on Data Management and Analytics for Medicine and Healthcare (DMAH 2015) | volume = | number = | pages = | year = 2015 | url = http://www.di.fc.ul.pt/~bessani/publications/dmah15-bbc.pdf }} </ref><ref>{{cite web|title=Scalable Multi-Language Data Analysis on Beam: The Cuneiform Experience|url=http://www.erlang-factory.com/euc2016/jorgen-brandt|website=Erlang-factory.com|access-date=28 October 2016}}</ref> Cuneiform is influenced by the work of Peter Kelly who proposes functional programming as a model for scientific workflow execution.<ref>{{cite journal | last1 = Kelly | first1 = Peter M. | last2 = Coddington | first2 = Paul D. | last3 = Wendelborn | first3 = Andrew L. | year = 2009 | title = Lambda calculus as a workflow model | journal = Concurrency and Computation: Practice and Experience | volume = 21 | issue = 16 | pages = 1999–2017 | doi = 10.1002/cpe.1448| s2cid = 10833434 }}</ref><ref> {{cite journal | title = Workflows and extensions to the Kepler scientific workflow system to support environmental sensor data access and analysis | last1 = Barseghian | first1 = Derik | last2 = Altintas | first2 = Ilkay | last3 = Jones | first3 = Matthew B. 
| last4 = Crawl | first4 = Daniel | last5 = Potter | first5 = Nathan | last6 = Gallagher | first6 = James | last7 = Cornillon | first7 = Peter | last8 = Schildhauer | first8 = Mark | last9 = Borer | first9 = Elizabeth T. | last10 = Seabloom | first10 = Eric W. | journal = Ecological Informatics | volume = 5 | number = 1 | pages = 42–50 | year = 2010 | doi = 10.1016/j.ecoinf.2009.08.008 | s2cid = 16392118 | url = https://escholarship.org/content/qt2q46n1tp/qt2q46n1tp.pdf?t=nivnuu }} </ref> In this, Cuneiform is distinct from related workflow languages based on [[dataflow programming]] like [[Swift (parallel scripting language)|Swift]].<ref> {{cite journal | title = Nextflow enables reproducible computational workflows | last1 = Di Tommaso | first1 = Paolo | last2 = Chatzou | first2 = Maria | last3 = Floden | first3 = Evan W | last4 = Barja | first4 = Pablo Prieto | last5 = Palumbo | first5 = Emilio | last6 = Notredame | first6 = Cedric | journal = Nature Biotechnology | volume = 35 | number = 4 | pages = 316–319 | year = 2017 | doi = 10.1038/nbt.3820 | pmid = 28398311 | s2cid = 9690740 }} </ref> ==External software integration== External tools and libraries (e.g., [[R (programming language)|R]] or [[Python (programming language)|Python]] libraries) are integrated via a [[foreign function interface]]. In this it resembles, e.g., [[KNIME]] which allows the use of external software through snippet nodes, or [[Apache Taverna|Taverna]] which offers [[BeanShell]] services for integrating [[Java (programming language)|Java]] software. By defining a task in a foreign language it is possible to use the API of an external tool or library. 
This way, tools can be integrated directly without the need of writing a wrapper or reimplementing the tool.<ref>{{cite web|title=A Functional Workflow Language Implementation in Erlang|url=http://www.erlang-factory.com/static/upload/media/1448992381831050cuneiformberlinefl2015.pdf|access-date=28 October 2016}}</ref> Currently supported foreign programming languages are: {{div col}} * [[Bash (Unix shell)|Bash]] * [[Elixir (programming language)|Elixir]] * [[Erlang (programming language)|Erlang]] * [[Java (programming language)|Java]] * [[JavaScript]] * [[MATLAB]] * [[GNU Octave]] * [[Perl]] * [[Python (programming language)|Python]] * [[R (programming language)|R]] * [[Racket (programming language)|Racket]] {{div col end}} Foreign language support for [[AWK]] and [[gnuplot]] are planned additions. ==Type System== Cuneiform provides a simple, statically checked type system.<ref> {{ cite journal | title = Computation semantics of the functional scientific workflow language Cuneiform | last1 = Brandt | first1 = Jörgen | last2 = Reisig | first2 = Wolfgang | last3 = Leser | first3 = Ulf | journal = [[Journal of Functional Programming]] | volume = 27 | year = 2017 | doi = 10.1017/S0956796817000119 | s2cid = 6128299 }} </ref> While Cuneiform provides lists as [[compound data type]]s it omits traditional list accessors (head and tail) to avoid the possibility of runtime errors which might arise when accessing the empty list. Instead lists are accessed in an all-or-nothing fashion by only mapping or folding over them. Additionally, Cuneiform omits (at the organizational level) arithmetics which excludes the possibility of division by zero. The omission of any partially defined operation allows to guarantee that runtime errors can arise exclusively in foreign code. ===Base data types=== As base data types Cuneiform provides Booleans, strings, and files. Herein, files are used to exchange data in arbitrary format between foreign functions. 
===Records and pattern matching=== Cuneiform provides [[Record_(computer_science)|record]]s (structs) as compound data types. The example below shows the definition of a variable <code>r</code> being a record with two fields <code>a1</code> and <code>a2</code>, the first being a string and the second being a Boolean. <syntaxhighlight lang="swift"> let r : <a1 : Str, a2 : Bool> = <a1 = "my string", a2 = true>; </syntaxhighlight> Records can be accessed either via projection or via [[pattern matching]]. The example below extracts the two fields <code>a1</code> and <code>a2</code> from the record <code>r</code>. <syntaxhighlight lang="swift"> let a1 : Str = ( r|a1 ); let <a2 = a2 : Bool> = r; </syntaxhighlight> ===Lists and list processing=== Furthermore, Cuneiform provides lists as compound data types. The example below shows the definition of a variable <code>xs</code> being a file list with three elements. <syntaxhighlight lang="erlang"> let xs : [File] = ['a.txt', 'b.txt', 'c.txt' : File]; </syntaxhighlight> Lists can be processed with the for and fold operators. Herein, the for operator can be given multiple lists to consume list element-wise (similar to <code>for/list</code> in [[Racket (programming language)|Racket]], <code>mapcar</code> in [[Common Lisp]] or <code>zipwith</code> in [[Erlang (programming language)|Erlang]]). The example below shows how to map over a single list, the result being a file list. <syntaxhighlight lang="ruby"> for x <- xs do process-one( arg1 = x ) : File end; </syntaxhighlight> The example below shows how to zip two lists the result also being a file list. <syntaxhighlight lang="ruby"> for x <- xs, y <- ys do process-two( arg1 = x, arg2 = y ) : File end; </syntaxhighlight> Finally, lists can be aggregated by using the fold operator. The following example sums up the elements of a list. 
<syntaxhighlight lang="text"> fold acc = 0, x <- xs do add( a = acc, b = x ) end; </syntaxhighlight> ==Parallel execution== Cuneiform is a purely functional language, i.e., it does not support [[Reference (computer science)|mutable references]]. In the consequence, it can use subterm-independence to divide a program into parallelizable portions. The Cuneiform scheduler distributes these portions to worker nodes. In addition, Cuneiform uses a [[Evaluation strategy#Call by name|Call-by-Name evaluation strategy]] to compute values only if they contribute to the computation result. Finally, foreign function applications are [[Memoization|memoized]] to speed up computations that contain previously derived results. For example, the following Cuneiform program allows the applications of <code>f</code> and <code>g</code> to run in parallel while <code>h</code> is dependent and can be started only when both <code>f</code> and <code>g</code> are finished. {{pre|1= let output-of-f : File = f(); let output-of-g : File = g(); h( f = output-of-f, g = output-of-g ); }} The following Cuneiform program creates three parallel applications of the function <code>f</code> by mapping <code>f</code> over a three-element list: {{pre|1= let xs : [File] = ['a.txt', 'b.txt', 'c.txt' : File]; for x <- xs do f( x = x ) : File end; }} Similarly, the applications of <code>f</code> and <code>g</code> are independent in the construction of the record <code>r</code> and can, thus, be run in parallel: {{sxhl|lang=erlang|1= let r : <a : File, b : File> = <a = f(), b = g()>; }} ==Examples== A hello-world script: <syntaxhighlight lang="ruby"> def greet( person : Str ) -> <out : Str> in Bash *{ out="Hello $person" }* ( greet( person = "world" )|out ); </syntaxhighlight> This script defines a task <code>greet</code> in [[Bash (Unix shell)|Bash]] which prepends <code>"Hello "</code> to its string argument <code>person</code>. The function produces a record with a single string field <code>out</code>. 
Applying <code>greet</code>, binding the argument <code>person</code> to the string <code>"world"</code> produces the record <code><out = "Hello world"></code>. Projecting this record to its field <code>out</code> evaluates the string <code>"Hello world"</code>. Command line tools can be integrated by defining a task in [[Bash (Unix shell)|Bash]]: <syntaxhighlight lang="ruby"> def samtoolsSort( bam : File ) -> <sorted : File> in Bash *{ sorted=sorted.bam samtools sort -m 2G $bam -o $sorted }* </syntaxhighlight> In this example a task <code>samtoolsSort</code> is defined. It calls the tool [[SAMtools]], consuming an input file, in BAM format, and producing a sorted output file, also in BAM format. ==Release history== {| class="wikitable" |- ! Version !! Appearance !! Implementation Language !! Distribution Platform !! Foreign Languages |- ! 1.0.0 | May 2014 | [[Java (programming language)|Java]] | [[Apache Hadoop]] | Bash, Common Lisp, GNU Octave, Perl, Python, R, Scala |- ! 2.0.x | Mar. 2015 | [[Java (programming language)|Java]] | [[HTCondor]], [[Apache Hadoop]] | Bash, BeanShell, Common Lisp, MATLAB, GNU Octave, Perl, Python, R, Scala |- ! 2.2.x | Apr. 2016 | [[Erlang (programming language)|Erlang]] | [[HTCondor]], [[Apache Hadoop]] | Bash, Perl, Python, R |- ! 3.0.x | Feb. 2018 | [[Erlang (programming language)|Erlang]] | Distributed Erlang | Bash, Erlang, Java, MATLAB, GNU Octave, Perl, Python, R, Racket |} In April 2016, Cuneiform's implementation language switched from [[Java (programming language)|Java]] to [[Erlang (programming language)|Erlang]] and, in February 2018, its major distributed execution platform changed from a Hadoop to distributed Erlang. Additionally, from 2015 to 2018 [[HTCondor]] had been maintained as an alternative execution platform. Cuneiform's surface syntax was revised twice, as reflected in the major version number. 
===Version 1=== In its first draft published in May 2014, Cuneiform was closely related to [[Make (software)|Make]] in that it constructed a static data dependency graph which the interpreter traversed during execution. The major difference to later versions was the lack of conditionals, recursion, or static type checking. Files were distinguished from strings by juxtaposing single-quoted string values with a tilde <code>~</code>. The script's query expression was introduced with the <code>target</code> keyword. Bash was the default foreign language. Function application had to be performed using an <code>apply</code> form that took <code>task</code> as its first keyword argument. One year later, this surface syntax was replaced by a streamlined but similar version. The following example script downloads a reference genome from an FTP server. <pre> declare download-ref-genome; deftask download-fa( fa : ~path ~id ) *{ wget $path/$id.fa.gz gunzip $id.fa.gz mv $id.fa $fa }* ref-genome-path = ~'ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes'; ref-genome-id = ~'chr22'; ref-genome = apply( task : download-fa path : ref-genome-path id : ref-genome-id ); target ref-genome; </pre> ===Version 2=== [[File:Cf screenshot.jpg|thumb|Swing-based editor and REPL for Cuneiform 2.0.3]] The second draft of the Cuneiform surface syntax, first published in March 2015, remained in use for three years outlasting the transition from Java to Erlang as Cuneiform's implementation language. Evaluation differs from earlier approaches in that the interpreter reduces a query expression instead of traversing a static graph. During the time the surface syntax remained in use the interpreter was formalized and simplified which resulted in a first specification of Cuneiform's semantics. The syntax featured conditionals. However, Booleans were encoded as lists, recycling the empty list as Boolean false and the non-empty list as Boolean true. 
Recursion was added later as a byproduct of formalization. However, static type checking was introduced only in Version 3. The following script decompresses a zipped file and splits it into evenly sized partitions. <pre> deftask unzip( <out( File )> : zip( File ) ) in bash *{ unzip -d dir $zip out=`ls dir | awk '{print "dir/" $0}'` }* deftask split( <out( File )> : file( File ) ) in bash *{ split -l 1024 $file txt out=txt* }* sotu = "sotu/stateoftheunion1790-2014.txt.zip"; fileLst = split( file: unzip( zip: sotu ) ); fileLst; </pre> ===Version 3=== The current version of Cuneiform's surface syntax, in comparison to earlier drafts, is an attempt to close the gap to mainstream functional programming languages. It features a simple, statically checked typesystem and introduces records in addition to lists as a second type of compound data structure. Booleans are a separate base data type. The following script untars a file resulting in a file list. <pre> def untar( tar : File ) -> <fileLst : [File]> in Bash *{ tar xf $tar fileLst=`tar tf $tar` }* let hg38Tar : File = 'hg38/hg38.tar'; let <fileLst = faLst : [File]> = untar( tar = hg38Tar ); faLst; </pre> ==References== {{Reflist|30em}} [[Category:Programming languages]] [[Category:Workflow languages]] [[Category:Functional languages]] [[Category:Scripting languages]] [[Category:Linux programming tools]] [[Category:Hadoop]] [[Category:Statically typed programming languages]] [[Category:Cross-platform free software]]'
New page wikitext, after the edit (new_wikitext)
'{{Short description|Open-source workflow language}} {{Infobox programming language | name = Cuneiform | logo = G18225.png | screenshot = Cf screenshot.jpg | caption = Screenshot of the Cuneiform editor and command line shell | paradigm = [[functional programming|functional]], [[Scientific workflow system|scientific workflow]] | designer = Jörgen Brandt | founder = | status = Active | latest release version = 3.0.4 | latest release date = {{release date|2018|11|19}} | latest preview version = | latest preview date = | typing = [[Static typing|static]], simple types | implementations = | dialects = | influenced_by = [[Swift (parallel scripting language)]] | influenced = | operating system = [[Linux]], [[Mac OS]] | programming language = [[Erlang (programming language)|Erlang]] | license = [[Apache License]] 2.0 | website = {{URL|https://cuneiform-lang.org/}} | file_ext = .cfl | year = 2013 }} '''Cuneiform''' is an [[open source software|open-source]] [[Scientific workflow system|workflow language]] for large-scale scientific data analysis.<ref>{{Cite web|url=https://github.com/joergen7/cuneiform|title = Joergen7/Cuneiform|website = [[GitHub]]|date = 14 October 2021}}</ref><ref>{{Cite journal | last1 = Brandt | first1 = Jörgen | last2 = Bux | first2 = Marc N. | last3 = Leser | first3 = Ulf | title = Cuneiform: A functional language for large scale scientific data analysis | journal = Proceedings of the Workshops of the EDBT/ICDT | volume = 1330 | pages = 17–26 | year = 2015 | url = http://ceur-ws.org/Vol-1330/paper-03.pdf }}</ref> It is a [[Type system#STATIC|statically typed]] [[Functional programming|functional programming language]] promoting [[parallel computing]]. It features a versatile [[foreign function interface]] allowing users to integrate software from many external programming languages. 
At the organizational level Cuneiform provides facilities like [[Conditional (computer programming)|conditional branching]] and [[Recursion|general recursion]] making it [[Turing completeness|Turing-complete]]. In this, Cuneiform is the attempt to close the gap between scientific workflow systems like [[Apache Taverna|Taverna]], [[KNIME]], or [[Galaxy (computational biology)|Galaxy]] and large-scale data analysis programming models like [[MapReduce]] or [[Pig (programming tool)|Pig Latin]] while offering the generality of a functional programming language. Cuneiform is implemented in distributed [[Erlang (programming language)|Erlang]]. If run in distributed mode it drives a [[POSIX]]-compliant distributed file system like [[Gluster]] or [[Ceph (software)#CephFS|Ceph]] (or a [[Filesystem in Userspace|FUSE]] integration of some other file system, e.g., [[Apache Hadoop#HDFS|HDFS]]). Alternatively, Cuneiform scripts can be executed on top of [[HTCondor]] or [[Apache Hadoop|Hadoop]].<ref>{{cite web|title=Scalable Multi-Language Data Analysis on Beam: The Cuneiform Experience by Jörgen Brandt|url=http://beta.erlangcentral.org/videos/scalable-multi-language-data-analysis-on-beam-the-cuneiform-experience-by-jorgen-brandt/#.WBLlE2hNzIU|website=Erlang Central|access-date=28 October 2016|archive-url=https://web.archive.org/web/20161002222350/http://beta.erlangcentral.org/videos/scalable-multi-language-data-analysis-on-beam-the-cuneiform-experience-by-jorgen-brandt/#.WBLlE2hNzIU|archive-date=2 October 2016|url-status=dead}}</ref><ref> {{Cite journal | last1 = Bux | first1 = Marc | last2 = Brandt | first2 = Jörgen | last3 = Lipka | first3 = Carsten | last4 = Hakimzadeh | first4 = Kamal | last5 = Dowling | first5 = Jim | last6 = Leser | first6 = Ulf | title = SAASFEE: scalable scientific workflow execution engine | journal = Proceedings of the VLDB Endowment | volume = 8 | number = 12 | pages = 1892–1895 | year = 2015 | url = http://www.vldb.org/pvldb/vol8/p1892-bux.pdf | doi = 
10.14778/2824032.2824094 }}</ref><ref> {{Cite journal | last1 = Bessani | first1 = Alysson | last2 = Brandt | first2 = Jörgen | last3 = Bux | first3 = Marc | last4 = Cogo | first4 = Vinicius | last5 = Dimitrova | first5 = Lora | last6 = Dowling | first6 = Jim | last7 = Gholami | first7 = Ali | last8 = Hakimzadeh | first8 = Kamal | last9 = Hummel | first9 = Michael | last10 = Ismail | first10 = Mahmoud | last11 = Laure | first11 = Erwin | last12 = Leser | first12 = Ulf | last13 = Litton | first13 = Jan-Eric | last14 = Martinez | first14 = Roxanna | last15 = Niazi | first15 = Salman | last16 = Reichel | first16 = Jane | last17 = Zimmermann | first17 = Karin | title = Biobankcloud: a platform for the secure storage, sharing, and processing of large biomedical data sets | journal = The First International Workshop on Data Management and Analytics for Medicine and Healthcare (DMAH 2015) | volume = | number = | pages = | year = 2015 | url = http://www.di.fc.ul.pt/~bessani/publications/dmah15-bbc.pdf }} </ref><ref>{{cite web|title=Scalable Multi-Language Data Analysis on Beam: The Cuneiform Experience|url=http://www.erlang-factory.com/euc2016/jorgen-brandt|website=Erlang-factory.com|access-date=28 October 2016}}</ref> Cuneiform is influenced by the work of Peter Kelly who proposes functional programming as a model for scientific workflow execution.<ref>{{cite journal | last1 = Kelly | first1 = Peter M. | last2 = Coddington | first2 = Paul D. | last3 = Wendelborn | first3 = Andrew L. | year = 2009 | title = Lambda calculus as a workflow model | journal = Concurrency and Computation: Practice and Experience | volume = 21 | issue = 16 | pages = 1999–2017 | doi = 10.1002/cpe.1448| s2cid = 10833434 }}</ref><ref> {{cite journal | title = Workflows and extensions to the Kepler scientific workflow system to support environmental sensor data access and analysis | last1 = Barseghian | first1 = Derik | last2 = Altintas | first2 = Ilkay | last3 = Jones | first3 = Matthew B. 
| last4 = Crawl | first4 = Daniel | last5 = Potter | first5 = Nathan | last6 = Gallagher | first6 = James | last7 = Cornillon | first7 = Peter | last8 = Schildhauer | first8 = Mark | last9 = Borer | first9 = Elizabeth T. | last10 = Seabloom | first10 = Eric W. | journal = Ecological Informatics | volume = 5 | number = 1 | pages = 42–50 | year = 2010 | doi = 10.1016/j.ecoinf.2009.08.008 | s2cid = 16392118 | url = https://escholarship.org/content/qt2q46n1tp/qt2q46n1tp.pdf?t=nivnuu }} </ref> In this, Cuneiform is distinct from related workflow languages based on [[dataflow programming]] like [[Swift (parallel scripting language)|Swift]].<ref> {{cite journal | title = Nextflow enables reproducible computational workflows | last1 = Di Tommaso | first1 = Paolo | last2 = Chatzou | first2 = Maria | last3 = Floden | first3 = Evan W | last4 = Barja | first4 = Pablo Prieto | last5 = Palumbo | first5 = Emilio | last6 = Notredame | first6 = Cedric | journal = Nature Biotechnology | volume = 35 | number = 4 | pages = 316–319 | year = 2017 | doi = 10.1038/nbt.3820 | pmid = 28398311 | s2cid = 9690740 }} </ref> ==External software integration== External tools and libraries (e.g., [[R (programming language)|R]] or [[Python (programming language)|Python]] libraries) are integrated via a [[foreign function interface]]. In this it resembles, e.g., [[KNIME]] which allows the use of external software through snippet nodes, or [[Apache Taverna|Taverna]] which offers [[BeanShell]] services for integrating [[Java (programming language)|Java]] software. By defining a task in a foreign language it is possible to use the API of an external tool or library. 
This way, tools can be integrated directly without the need of writing a wrapper or reimplementing the tool.<ref>{{cite web|title=A Functional Workflow Language Implementation in Erlang|url=http://www.erlang-factory.com/static/upload/media/1448992381831050cuneiformberlinefl2015.pdf|access-date=28 October 2016}}</ref> Currently supported foreign programming languages are: {{div col}} * [[Bash (Unix shell)|Bash]] * [[Elixir (programming language)|Elixir]] * [[Erlang (programming language)|Erlang]] * [[Java (programming language)|Java]] * [[JavaScript]] * [[MATLAB]] * [[GNU Octave]] * [[Perl]] * [[Python (programming language)|Python]] * [[R (programming language)|R]] * [[Racket (programming language)|Racket]] {{div col end}} Foreign language support for [[AWK]] and [[gnuplot]] are planned additions. ==Type System== Cuneiform provides a simple, statically checked type system.<ref> {{ cite journal | title = Computation semantics of the functional scientific workflow language Cuneiform | last1 = Brandt | first1 = Jörgen | last2 = Reisig | first2 = Wolfgang | last3 = Leser | first3 = Ulf | journal = [[Journal of Functional Programming]] | volume = 27 | year = 2017 | doi = 10.1017/S0956796817000119 | s2cid = 6128299 }} </ref> While Cuneiform provides lists as [[compound data type]]s it omits traditional list accessors (head and tail) to avoid the possibility of runtime errors which might arise when accessing the empty list. Instead lists are accessed in an all-or-nothing fashion by only mapping or folding over them. Additionally, Cuneiform omits (at the organizational level) arithmetics which excludes the possibility of division by zero. The omission of any partially defined operation allows to guarantee that runtime errors can arise exclusively in foreign code. ===Base data types=== As base data types Cuneiform provides Booleans, strings, and files. Herein, files are used to exchange data in arbitrary format between foreign functions. 
===Records and pattern matching=== Cuneiform provides [[Record_(computer_science)|record]]s (structs) as compound data types. The example below shows the definition of a variable <code>r</code> being a record with two fields <code>a1</code> and <code>a2</code>, the first being a string and the second being a Boolean. <syntaxhighlight lang="swift"> let r : <a1 : Str, a2 : Bool> = <a1 = "my string", a2 = true>; </syntaxhighlight> Records can be accessed either via projection or via [[pattern matching]]. The example below extracts the two fields <code>a1</code> and <code>a2</code> from the record <code>r</code>. <syntaxhighlight lang="swift"> let a1 : Str = ( r|a1 ); let <a2 = a2 : Bool> = r; </syntaxhighlight> ===Lists and list processing=== Furthermore, Cuneiform provides lists as compound data types. The example below shows the definition of a variable <code>xs</code> being a file list with three elements. <syntaxhighlight lang="erlang"> let xs : [File] = ['a.txt', 'b.txt', 'c.txt' : File]; </syntaxhighlight> Lists can be processed with the for and fold operators. Herein, the for operator can be given multiple lists to consume list element-wise (similar to <code>for/list</code> in [[Racket (programming language)|Racket]], <code>mapcar</code> in [[Common Lisp]] or <code>zipwith</code> in [[Erlang (programming language)|Erlang]]). The example below shows how to map over a single list, the result being a file list. <syntaxhighlight lang="ruby"> for x <- xs do process-one( arg1 = x ) : File end; </syntaxhighlight> The example below shows how to zip two lists the result also being a file list. <syntaxhighlight lang="ruby"> for x <- xs, y <- ys do process-two( arg1 = x, arg2 = y ) : File end; </syntaxhighlight> Finally, lists can be aggregated by using the fold operator. The following example sums up the elements of a list. 
<syntaxhighlight lang="text">
fold acc = 0, x <- xs do
  add( a = acc, b = x )
end;
</syntaxhighlight>

==Parallel execution==

Cuneiform is a purely functional language, i.e., it does not support [[Reference (computer science)|mutable references]]. In the consequence, it can use subterm-independence to divide a program into parallelizable portions. The Cuneiform scheduler distributes these portions to worker nodes. In addition, Cuneiform uses a [[Evaluation strategy#Call by name|Call-by-Name evaluation strategy]] to compute values only if they contribute to the computation result. Finally, foreign function applications are [[Memoization|memoized]] to speed up computations that contain previously derived results.

For example, the following Cuneiform program allows the applications of <code>f</code> and <code>g</code> to run in parallel while <code>h</code> is dependent and can be started only when both <code>f</code> and <code>g</code> are finished.

{{pre|1=
let output-of-f : File = f();
let output-of-g : File = g();

h( f = output-of-f, g = output-of-g );
}}

The following Cuneiform program creates three parallel applications of the function <code>f</code> by mapping <code>f</code> over a three-element list:

{{pre|1=
let xs : [File] = ['a.txt', 'b.txt', 'c.txt' : File];

for x <- xs do
  f( x = x )
  : File
end;
}}

Similarly, the applications of <code>f</code> and <code>g</code> are independent in the construction of the record <code>r</code> and can, thus, be run in parallel:

{{sxhl|lang=erlang|1=
let r : <a : File, b : File> =
  <nowiki><a = f(), b = g()></nowiki>;
}}

==Examples==

A hello-world script:

<syntaxhighlight lang="ruby">
def greet( person : Str ) -> <out : Str>
in Bash *{
  out="Hello $person"
}*

( greet( person = "world" )|out );
</syntaxhighlight>

This script defines a task <code>greet</code> in [[Bash (Unix shell)|Bash]] which prepends <code>"Hello "</code> to its string argument <code>person</code>.
The function produces a record with a single string field <code>out</code>. Applying <code>greet</code>, binding the argument <code>person</code> to the string <code>"world"</code> produces the record <code><out = "Hello world"></code>. Projecting this record to its field <code>out</code> evaluates the string <code>"Hello world"</code>.

Command line tools can be integrated by defining a task in [[Bash (Unix shell)|Bash]]:

<syntaxhighlight lang="ruby">
def samtoolsSort( bam : File ) -> <sorted : File>
in Bash *{
  sorted=sorted.bam
  samtools sort -m 2G $bam -o $sorted
}*
</syntaxhighlight>

In this example a task <code>samtoolsSort</code> is defined. It calls the tool [[SAMtools]], consuming an input file, in BAM format, and producing a sorted output file, also in BAM format.

==Release history==

{| class="wikitable"
|-
! Version !! Appearance !! Implementation Language !! Distribution Platform !! Foreign Languages
|-
! 1.0.0
| May 2014
| [[Java (programming language)|Java]]
| [[Apache Hadoop]]
| Bash, Common Lisp, GNU Octave, Perl, Python, R, Scala
|-
! 2.0.x
| Mar. 2015
| [[Java (programming language)|Java]]
| [[HTCondor]], [[Apache Hadoop]]
| Bash, BeanShell, Common Lisp, MATLAB, GNU Octave, Perl, Python, R, Scala
|-
! 2.2.x
| Apr. 2016
| [[Erlang (programming language)|Erlang]]
| [[HTCondor]], [[Apache Hadoop]]
| Bash, Perl, Python, R
|-
! 3.0.x
| Feb. 2018
| [[Erlang (programming language)|Erlang]]
| Distributed Erlang
| Bash, Erlang, Java, MATLAB, GNU Octave, Perl, Python, R, Racket
|}

In April 2016, Cuneiform's implementation language switched from [[Java (programming language)|Java]] to [[Erlang (programming language)|Erlang]] and, in February 2018, its major distributed execution platform changed from a Hadoop to distributed Erlang. Additionally, from 2015 to 2018 [[HTCondor]] had been maintained as an alternative execution platform.

Cuneiform's surface syntax was revised twice, as reflected in the major version number.
===Version 1===

In its first draft published in May 2014, Cuneiform was closely related to [[Make (software)|Make]] in that it constructed a static data dependency graph which the interpreter traversed during execution. The major difference to later versions was the lack of conditionals, recursion, or static type checking. Files were distinguished from strings by juxtaposing single-quoted string values with a tilde <code>~</code>. The script's query expression was introduced with the <code>target</code> keyword. Bash was the default foreign language. Function application had to be performed using an <code>apply</code> form that took <code>task</code> as its first keyword argument. One year later, this surface syntax was replaced by a streamlined but similar version.

The following example script downloads a reference genome from an FTP server.

<pre>
declare download-ref-genome;

deftask download-fa( fa : ~path ~id ) *{
    wget $path/$id.fa.gz
    gunzip $id.fa.gz
    mv $id.fa $fa
}*

ref-genome-path = ~'ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes';
ref-genome-id = ~'chr22';

ref-genome = apply(
    task : download-fa
    path : ref-genome-path
    id : ref-genome-id
);

target ref-genome;
</pre>

===Version 2===

[[File:Cf screenshot.jpg|thumb|Swing-based editor and REPL for Cuneiform 2.0.3]]
The second draft of the Cuneiform surface syntax, first published in March 2015, remained in use for three years outlasting the transition from Java to Erlang as Cuneiform's implementation language. Evaluation differs from earlier approaches in that the interpreter reduces a query expression instead of traversing a static graph. During the time the surface syntax remained in use the interpreter was formalized and simplified which resulted in a first specification of Cuneiform's semantics. The syntax featured conditionals. However, Booleans were encoded as lists, recycling the empty list as Boolean false and the non-empty list as Boolean true.
Recursion was added later as a byproduct of formalization. However, static type checking was introduced only in Version 3.

The following script decompresses a zipped file and splits it into evenly sized partitions.

<pre>
deftask unzip( <out( File )> : zip( File ) ) in bash *{
  unzip -d dir $zip
  out=`ls dir | awk '{print "dir/" $0}'`
}*

deftask split( <out( File )> : file( File ) ) in bash *{
  split -l 1024 $file txt
  out=txt*
}*

sotu = "sotu/stateoftheunion1790-2014.txt.zip";
fileLst = split( file: unzip( zip: sotu ) );

fileLst;
</pre>

===Version 3===

The current version of Cuneiform's surface syntax, in comparison to earlier drafts, is an attempt to close the gap to mainstream functional programming languages. It features a simple, statically checked typesystem and introduces records in addition to lists as a second type of compound data structure. Booleans are a separate base data type.

The following script untars a file resulting in a file list.

<pre>
def untar( tar : File ) -> <fileLst : [File]>
in Bash *{
  tar xf $tar
  fileLst=`tar tf $tar`
}*

let hg38Tar : File = 'hg38/hg38.tar';

let <fileLst = faLst : [File]> = untar( tar = hg38Tar );

faLst;
</pre>

==References==
{{Reflist|30em}}

[[Category:Programming languages]]
[[Category:Workflow languages]]
[[Category:Functional languages]]
[[Category:Scripting languages]]
[[Category:Linux programming tools]]
[[Category:Hadoop]]
[[Category:Statically typed programming languages]]
[[Category:Cross-platform free software]]'
Unified diff of changes made by edit (edit_diff)
'@@ -255,5 +255,5 @@
 {{sxhl|lang=erlang|1=
 let r : <a : File, b : File> =
-  <a = f(), b = g()>;
+  <nowiki><a = f(), b = g()></nowiki>;
 }}
 
'
New page size (new_size)
19211
Old page size (old_size)
19194
Size change in edit (edit_delta)
17
Lines added in edit (added_lines)
[ 0 => ' <nowiki><a = f(), b = g()></nowiki>;' ]
Lines removed in edit (removed_lines)
[ 0 => ' <a = f(), b = g()>;' ]
Whether or not the change was made through a Tor exit node (tor_exit_node)
false
Unix timestamp of change (timestamp)
'1682731201'