Migrating from Julia 0.4 to Julia 1.6
This process references release v98 (March, 2022).
This process requires that you have access to a DSWB container capable of running Julia 1.6 notebooks.
Starting notes
- The DSWB module pulls in and re-exports the most commonly required submodules:
  - BeaconAnalysis, DSWBSessionAnalysis, DSWBErrorAnalysis, SessionSummarizer, ResourceAnalysis
  - DBConstants, DSWBUtilities, DSWBConnecting, mPulseAPI
  - CommonSQL, SQLFilters, TableDefinitions, mPulseConstants
  - DBI, DBITypes, DataFrames
  - C3Viz, DataFrameUtilities
  - CSV, FileIO
  - DistributionStats, DataStructures (symbols must be qualified), Statistics, StatsBase, StatsFuns
  - Dates, TimeZones
  - Formatting, Logging, UUIDs
  - HTTP (symbols must be qualified), JSON, CurlHelper, LightXML
  - IJulia
- You can use the Pkg module to find details about available modules:
using Pkg
Pkg.status()
- Use the `VERSION` variable to make sure you're using Julia 1.6.
- These instructions are subject to change in future releases (see the date at the start of the doc).
- Julia 1.6 functions are now fully documented at the regular documentation link.
- We no longer have a `DSWBCharting` module; all charting functions are now defined alongside the matching get* function.
- We no longer provide the `IJuliaCompat` module (which gets you `readprompt`). Add `using IJulia` followed by `IJulia.readprompt` to get the same functionality.
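As a minimal sketch of the version check mentioned above, the following compares `VERSION` against version literals. The variable name `is_julia_16` is only illustrative.

```julia
# VERSION is a VersionNumber, so it can be compared against v"..." literals.
is_julia_16 = v"1.6" <= VERSION < v"1.7"

if is_julia_16
    @info "Running Julia $(VERSION)"
else
    @warn "These notes target Julia 1.6; this kernel reports $(VERSION)"
end
```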
Notable changes in v98.
This is a short list of the most notable changes between v97 and v98.
- Add Julia 1.6 support
- Add support for LISTAGG to SessionSummarizer
- Add percentile support to Resource Treemap
- Allow getTopDimensionsBySession to work with dimension expressions as well
- SessionSummarizer more aggressively tries to avoid Null references and divide by zero errors
- Handle numeric database values larger than typemax(Int64)
- New `:COUNT_DISTINCT` aggregate parameter
- Add Table constants for the Error Table (ERR_TABLE)
- Add color domain and range support to treemaps
Syntax differences
- `type` is now called `struct` and is immutable by default; use `mutable struct` to make it mutable.
- `contains` and `ismatch` are replaced by `occursin`.
- `ucfirst` and `lcfirst` are replaced by `uppercasefirst` and `lowercasefirst`.
- `replace` takes 2 parameters instead of 3. The second parameter is now a `Pair` of what used to be the 2nd and 3rd parameters, so change `replace(str, pat, rep)` to `replace(str, pat => rep)`. To replace `n` occurrences, use the `count=n` keyword argument.
- `find*` now returns `nothing` instead of `0` if nothing is found. This can be problematic if you then use `if findindex > 0`; instead use `if findindex !== nothing`.
- `fieldnames` no longer works on objects. Its argument MUST be a data type, so use `fieldnames(typeof(obj))` instead of `fieldnames(obj)`.
- The `obj.(sym)` syntax doesn't appear to work to dereference an object property using a symbol, but you can still use `getfield(obj, sym)`.
- The datatype `Void` is now called `Nothing`.
- Template functions and types that were previously written as `function xyz{T <: Foo}(x::T)` should now be written as `function xyz(x::T) where {T <: Foo}`.
- You MUST `import` a function from another module if you want to add to its methods in your module.
- Logging functions are now macros, so use `@warn`, `@info`, etc. instead of `warn`, `info`. Additionally, the `err` logging function is now the `@error` macro.
- To test warnings, debug, info, etc., use `@test_logs (:warn, "Warn message") <expression>`.
- `reduce`, `mapreduce`, `mapfoldl`, and `mapfoldr` now take the initial value as a keyword argument called `init`.
- The `sort` and `sort!` methods on dataframes take `cols` as the second argument and no longer as a keyword argument, so change `sort(df, cols=[x, y, z])` to `sort(df, [x, y, z])`.
- Change `eltypes(df)` to `eltype.(eachcol(df))`.
- Dataframes can no longer be addressed only by column. Specify `!` as the row to get all rows for a column: `df[!, colname]`, or alternately, use `df.colname` if colname is a valid Julia identifier.
- Dataframe columns may be addressed as symbols, strings, or indices.
- The `names` function applied to a dataframe will return column names as strings, so when searching for a column name this way, use a string rather than a Symbol.
- `shift!` is now called `popfirst!` and `unshift!` is called `pushfirst!`.
- Database `NULLs` are no longer stored as `NA`. Julia 1.0 uses `missing` to represent `NULLs`. All code using `isna` needs to change to `ismissing`. Additionally, `ismissing` does not operate on arrays, so the function needs to be broadcast using `ismissing.()`.
- `round()` no longer works on arrays; use the broadcast operator `round.()` instead.
- To `round()` to a number of decimal places, specify the number with the keyword argument `digits`.
- In `filter(f, ::Dict)`, `f` is now passed a single pair instead of two arguments.
- The second argument to the `invoke` function used to be a `tuple()` object but is now a `Tuple{}` type.
- `tic()` and `toq()` have been removed. Instead use `time_ns()` for nanoseconds and `time()` for second resolution.
- `names!(df, [col names])` is deprecated; use `rename!` instead.
- `isreadable` no longer works on filenames; use `isfile` instead.
- You cannot use `join` to join `DataFrames`; instead use one of `semijoin`, `leftjoin`, `antijoin`, `rightjoin`, `outerjoin`, `innerjoin`, `crossjoin`.
- `findin` no longer exists; use `indexin` instead. Note that `indexin` works differently, and returns an array containing `nothing` for items that don't exist.
- The `by` method for `DataFrames` has been replaced by a combination of `combine` and `groupby`.
- Binary operators that previously worked on arrays may need to be broadcast. For example, the `&` and `|` operators on `BitArrays` now need to be `.&` and `.|`.
  Similarly, when assigning a scalar to an array, use the `.=` operator rather than `=`.
- `indmin` and `indmax` no longer exist; use `findmin` and `findmax` instead. These return a tuple, and the index is the second element.
- `STDIN`, `STDOUT`, and `STDERR` are now `stdin`, `stdout`, and `stderr`, so instead of calling `flush(STDOUT)`, use `flush(stdout)`.
- The element type of DataFrame columns returned by any of our database calls will be `Union{Missing, <data type>}`. To check what `<data type>` is, apply the `nonmissingtype` function to the column's element type:
  `if nonmissingtype(eltype(df.column)) <: AbstractString`.
- `map` is no longer defined on `Dicts`. Use `collect` to transform a `Dict` into a `Vector` of `Pairs` and either use `map`, or a list comprehension on the result.
- The `keep` keyword argument of `split` is now called `keepempty`.
- `readall` has been replaced with `read`. Additionally, to read a file as a string, pass `String` as the second argument: `read("filename", String)`.
- When searching for characters in a String using `findfirst`, `findnext`, `findprev`, and `findlast`, the arguments are swapped. The needle (Char) comes first, and the haystack (String) is second.
- The `TimeZones` module no longer supports deprecated timezones by default, so for example `TimeZone("Asia/Calcutta")` will throw an exception. Use `TimeZone("Asia/Calcutta", TimeZones.Class(:ALL))` to allow deprecated timezone names.
- `writetable` no longer works to save a `DataFrame` object to a `csv` file. Use `CSV.write("sample.csv", sample_df)` instead.
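The type and method changes in the list above can be sketched in plain Julia as follows. The names `Point`, `Counter`, `Bag`, and `describe` are made up for illustration and are not part of DSWB.

```julia
# `type` became `struct` (immutable); use `mutable struct` when fields must change.
struct Point
    x::Float64
    y::Float64
end

mutable struct Counter
    n::Int
end

# Type parameters now go in a trailing `where` clause.
function describe(v::T) where {T <: Number}
    return string(T, ": ", v)
end

# Adding a method to a function owned by another module needs an explicit import.
import Base: length
struct Bag
    items::Vector{Int}
end
length(b::Bag) = length(b.items)

p = Point(1.0, 2.0)
c = Counter(0)
c.n += 1                              # allowed because Counter is mutable

field_list = fieldnames(typeof(p))    # pass the type, not the instance
y_value = getfield(p, :y)             # replaces the old obj.(sym) syntax
bag_size = length(Bag([1, 2, 3]))
```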
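For the string-related items in the list above, a short sketch using only Base functions might look like this. The sample string is arbitrary.

```julia
s = "hello world, hello julia"

# contains / ismatch -> occursin (needle first, haystack second)
has_hello = occursin("hello", s)
has_digit = occursin(r"\d", s)

# ucfirst -> uppercasefirst
title = uppercasefirst("julia")

# replace takes a Pair; `count` limits how many matches are replaced
once = replace(s, "hello" => "bye"; count=1)

# find* returns nothing (not 0) when there is no match
idx = findfirst('z', s)
found = idx !== nothing

# Character search: needle (Char) first, haystack (String) second
first_o = findfirst('o', s)

# keep -> keepempty
parts = split("a,,b", ","; keepempty=false)
```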
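The `missing` and broadcasting items above can be tried on an ordinary vector, which behaves like a DataFrame column for these purposes. This is a sketch with made-up values rather than output from a real query.

```julia
# A column as a database call might return it: values mixed with missing.
col = Union{Missing, Float64}[1.234, missing, 3.456]

# isna -> ismissing, broadcast over the array
missing_mask = ismissing.(col)

# nonmissingtype works on the element type, not on the column itself
value_type = nonmissingtype(eltype(col))
is_numeric = value_type <: Number

# round must be broadcast; decimal places go in the digits keyword
known = collect(skipmissing(col))
rounded = round.(known; digits=1)

# Element-wise boolean operators need the dotted forms
a = BitArray([true, false, true])
b = BitArray([true, true, false])
both = a .& b
either = a .| b

# Assigning a scalar to every element uses .=
buffer = zeros(3)
buffer .= 7.0
```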
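The collection-related changes above (the `init` keyword, `filter` and `map` on a `Dict`, `popfirst!`/`pushfirst!`, `findmin`, and `indexin`) are sketched below with small literal inputs.

```julia
# The initial value is now the init keyword
total = reduce(+, [1, 2, 3]; init=10)
sum_sq = mapreduce(x -> x^2, +, [1, 2, 3]; init=0)

d = Dict("a" => 1, "b" => 2, "c" => 3)

# filter on a Dict passes one Pair to the predicate
large = filter(kv -> kv.second > 1, d)

# map is not defined on Dict: collect to a Vector of Pairs first
labels = sort(map(kv -> string(kv.first, "=", kv.second), collect(d)))

# shift! -> popfirst!, unshift! -> pushfirst!
queue = [1, 2, 3]
first_item = popfirst!(queue)
pushfirst!(queue, 0)

# indmin -> findmin, which returns (value, index)
min_value, min_index = findmin([4, 2, 9])

# findin -> indexin: one entry per searched item, nothing when absent
positions = indexin([2, 5], [1, 2, 3])
```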
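The DataFrame items above can be sketched as follows, assuming a DataFrames 1.x release is installed (the DSWB module re-exports it). The column names and values are invented for the example.

```julia
using DataFrames

df = DataFrame(page = ["home", "cart", "home"], load_ms = [1200, 800, 1000])

# Column access needs a row selector; `!` means all rows, without copying
load = df[!, :load_ms]
same = df.load_ms

# names returns Strings, so search with a String
has_page = "page" in names(df)

# eltypes(df) -> eltype.(eachcol(df))
col_types = eltype.(eachcol(df))

# cols is now the second positional argument of sort
sorted = sort(df, [:page, :load_ms])

# by(...) -> combine(groupby(...), ...)
per_page = combine(groupby(df, :page), :load_ms => sum => :total_ms)

# join -> innerjoin, leftjoin, and friends
owners = DataFrame(page = ["home", "cart"], team = ["web", "checkout"])
joined = innerjoin(df, owners; on = :page)

# names!(df, ...) -> rename!
rename!(df, [:page_group, :load_time_ms])
```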
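To illustrate the logging macros and `@test_logs` from the list above, here is a small sketch. The function `load_settings` is hypothetical and only exists to emit a warning.

```julia
using Logging
using Test

# Hypothetical function that warns through the logging macro.
function load_settings()
    @warn "Warn message"
    return Dict("timeout" => 30)
end

# Passes only if the expression emits exactly one :warn record with this text.
@test_logs (:warn, "Warn message") load_settings()

# A Regex can stand in for the message when the exact text varies.
@test_logs (:info, r"rows") @info "Fetched 10 rows"
```

Outside a `@testset`, a failing `@test_logs` throws immediately, which is usually what you want in a notebook cell.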
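The timing and file I/O replacements above can be sketched with a temporary file, so nothing here depends on files in your container.

```julia
# tic()/toq() are gone; take two time_ns() readings instead
t0 = time_ns()
total = sum(1:1_000)
elapsed_seconds = (time_ns() - t0) / 1e9

# Write a scratch file, then read it back as a String
path, io = mktemp()
write(io, "line one\nline two\n")
close(io)

file_exists = isfile(path)            # replaces isreadable(filename)
contents = read(path, String)         # replaces readall(filename)

println(stdout, "read ", length(contents), " characters")
flush(stdout)                         # STDOUT is now stdout

rm(path)
```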
Other notes on usage
Note that some coding styles were already deprecated in earlier releases on 0.4 and no longer exist in 1.6. For example:
- Instead of using the string "BEACON_TABLE", use the constant BEACONS_TABLE. This should be done in both 0.4 and 1.6.
- Instead of using this construct:
if !isa(QueryAPI.gLatestCon, DBHandles.NullHandle)
    QueryAPI.gLatestCon.conn.disconnect!()
end
Call disconnect!() instead.
- `padDateTime` now has to be qualified as `C3Viz.padDateTime`.
- SQL generators that used to be in `DBUtilities` (e.g. `DBUtilities.isCached`) are now in `CommonSQL`.
- The `replaceNA!` and `replaceNaN!` functions aren't strictly required any more, as Julia now has a built-in `replace!` method that works on `DataFrame` columns. Any of the following styles will work:
  - `replace!(df.numeric_colname, missing => 0.0)`
  - `replace!(df[!, :numeric_colname], NaN => 0.0)`
  - `replace!(df[!, "string colname"], missing => "")`
  - You can even pass in multiple replacements for the same column in one call:
    replace!(df.numeric_colname, missing => 0.0, NaN => 0.0)
- `replaceNA!` has been changed to `replacemissing!`. The main advantage of `replacemissing!` over `replace!` is that it can do data type promotion of columns. This is useful, for example, if you have a column that has only missing values. In that case the column's type is `Missing` and won't accept a value of a different type. `replacemissing!` changes the type to allow for a composite value.
- By default many functions will print out Info messages showing query execution time. To suppress these messages, add the following to the top of your notebook:
  Logging.disable_logging(Logging.Info)
- When referencing all rows in a dataframe column with a name that does not need to be quoted (for example, green_column), you can use either `df[!, :green_column]` or `df.green_column`.
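The difference between in-place `replace!` and type promotion can be seen on plain vectors, which is what a DataFrame column is underneath. This sketch uses Base's `coalesce` as a stand-in for the promotion idea; it is not the implementation of `replacemissing!`.

```julia
# A column that can hold Float64 accepts the replacements in place.
col = Union{Missing, Float64}[1.5, missing, NaN]
replace!(col, missing => 0.0, NaN => 0.0)

# A column holding only missing has element type Missing,
# so it cannot store a Float64 in place.
only_missing = [missing, missing]
in_place_failed = try
    replace!(only_missing, missing => 0.0)
    false
catch
    true
end

# Building a new vector lets the element type widen to Float64.
promoted = coalesce.(only_missing, 0.0)
```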
Supported DSWB Modules/Functions
DSWB
BeaconAnalysis
- getAggregateMetricByDimension
- getBeaconCount
- getBeaconsFirstAndLast
- getGroupPercentages
- getTopLandingPages
- getTopReferrers
- chartTopN
- chartLoadTimes
- chartBeaconTreemap
DSWBSessionAnalysis
- basic
  - getMedianSessionDuration,
  - getMedianSessionLength,
  - getSessionDistributionByDay,
  - getSessionDistributionByDayAndTopDimensions,
  - getSessionDurationQuantilesByDatepart,
  - chartSessionDurationQuantilesByDatepart
  - getTopPageGroupsByExitRate,
  - getTopDimensionsBySession,
  - getWeightedSessionDimensions,
  - getConcurrentSessionsAndBeaconsByDatepart,
  - chartConcurrentSessionsAndBeaconsOverTime
- bounce related
  - getBounceRate,
  - getBounceRateByLandingPage,
  - chartBounceRateByDimension,
  - getBounceRateByDatepart,
  - chartBounceRateByDatepart,
  - getBounceRateByLoadTime,
  - chartBouncesVsLoadTimes,
  - getLoadTimeAggregateAndBounceRates,
  - chartLoadTimeMediansAndBounceRatesByPageGroup,
  - getRetentionRateByDimension
- conversion related
  - getConversionRate,
  - chartConversionRateByDimension,
  - getConversionRatesByTimerAvg,
  - chartConversionsVsLoadTimes,
  - getPeakConversionRate,
  - getPeakConversions,
  - getExternalReferrersConversionsAndLoadTimes,
  - chartExternalReferrerSummary,
  - getPageGroupsAndSessionsWithConversionImpact,
  - getTopGroupsByConversionImpact,
  - chartConversionImpactByPageGroup,
  - getTopGroupsByActivityImpact,
  - chartActivityImpactByPageGroup
SessionSummarizer
- getSessionDistribution,
- getActiveSessions,
- getSessionSummary
DSWBErrorAnalysis
- getNetworkErrorRateByDimension
- getErrorCount
- getAvgErrorCount
- ErrorMessageCluster
- getNormalizedErrorMessage
- getClusterNoisePts
- addNewErrorsToGroups!
- getRemainingPagesDistribution
ResourceAnalysis
- getResources
- chartResources
- getResourceTrend
- chartResourceTrend
- getTreemapResources
- chartTreemapResources
- getLoadTimeStats
- chartLoadTimeStats
- getResourceServerStats
- chartResourceServerStats
- getRISResources
- getRISVendors
- getLoadTimeDistribution
- chartLoadTimeDistribution
- getResourceCount
DSWBUtilities
- datetime2ms,
- ms2datetime,
- NormalizedTimeType,
- timestampToZonedDateTime,
- DATE_FORMATS,
- DATE_FORMATS_TZ,
- getDateRange,
- getArgs,
- getArgTypes,
- getKWArgs,
- @soasta_time,
- isdatepart,
- extractDatepart,
- splitDatepart,
- isnan_proxy,
- mapGeoColumns,
- parseBucketParameters,
- plugTimeHoles!,
- sendToSlack (new support for rich blocks),
- sendmail (replaces sendSOASTAEmail),
- decodeBoomerangCompressedLog,
- decodeBoomerangCompressedTimeline
DSWBConnecting
- getConnection,
- getTenantAndAppName,
- setConversionMetric,
- setEndpoint,
- setEndpointAndTable,
- setEndpointForTenantId,
- setEndpointFromMPulse,
- setSnowflakeEndpoint,
- setTable,
- setTableForDomainId,
- disconnect!
C3Viz
- drawC3Viz,
- drawBoxPlots,
- drawTree,
- displayTitle,
- C3Viz.padDateTime
mPulseConstants
- mPulseConstants.mP2DSWB
- mPulseConstants.getFriendlyTimerName
- mPulseConstants.getTimerColor
DBConstants
- DBConstants.getPARAM()
- DBConstants.setPARAM!()
- DBConstants.getAD_META()
- DBConstants.setAD_META!()
- TIMER_TYPE,
- DATEPART_DISPLAY,
- DATEPART_TO_JULIA,
- BASE_SUPPORTED_DATEPARTS,
- BASE_SUPPORTED_WINDOW_FUNCTIONS,
- BASE_EXTENDED_WINDOW_FUNCTIONS,
- BASE_SUPPORTED_AGGREGATE_FUNCTIONS,
- UnsupportedWindowFunction,
- BEACON_TYPE_PAGE_VIEWS,
- BEACON_TYPE_XHR,
- BEACON_TYPE_SPA,
- BEACON_TYPE_API,
- BEACON_TYPE_ERROR,
- BEACON_TYPE_USER_INTERACTIONS,
- BEACON_TABLE_PREFIX_SF,
- BEACON_TABLE_NAME_ASGARD,
- RT_TABLE_NAME_ASGARD,
- SESSION_TABLE_NAME_ASGARD,
- ERROR_TABLE_NAME_ASGARD,
- DSN_PREFIX,
- PARAMS,
- DSN,
- CONNECTION,
- DOMAIN_OBJ,
- TENANT_OBJ,
- BEACONS_TABLE,
- RESOURCE_TABLE,
- SESSIONS_TABLE,
- ERROR_TABLE,
- APPID,
- CUSTOM_METRICS,
- CONV_METRIC,
- SESSION_TIMEOUT,
- MPULSE_AUTH_TOKEN,
- MPULSE_TIMER_BUCKET_DEFAULT_EDGES,
- REQUEST_HASH,
- MP_OBJECT_ENDPOINT,
- DSWB_SESSION_ID,
- AD_META,
- SOASTALIB,
- TEMPLATE_DIR,
- COLORS,
- BASE64_CHARS
DBI
- DBI.Connection
- execute
- select
- disconnect!
- isClosed
- cursor
- commit
- rollback
- setmeta
- printDSN
- listDSNs
- ODBC_SECTIONS
- setVerbose
DataFrameUtilities
- beautifyDF
- injectDefaultTableStyling
- replacemissing
- replacemissing!
- replaceNaN
- replaceNaN!
DistributionStats
- percentile
- median
- gstddev
- cdf_to_pdf
- pdf_to_cdf
CommonSQL
- truncateSQLTimestamp,
- assetTypeGuessSQL,
- contentTypeGuessSQL,
- isCached
- CommonSQL.filter2Snowflake
- CommonSQL.filter2Asgard
- CommonSQL.filter2String
- translateFilters,
- CommonSQL.applyAggregate
- CommonSQL.parse_url
- CommonSQL.IF
- CommonSQL.SQL_ARRAY
- CommonSQL.ARRAY_JOIN
- CommonSQL.REGEXP_SUBSTR/REGEXP_EXTRACT
- CommonSQL.MEDIAN
- CommonSQL.PERCENTILE
- CommonSQL.FIRST_VALUE
- CommonSQL.LAST_VALUE
- CommonSQL.LISTAGG
- CommonSQL.extractGroupDimensions
- CommonSQL.generateAggregateExpressions
- fixTableName,
- RT_HOST_EXPRESSION,
- CommonSQL.hostToDomain
- CommonSQL.hasTao
SQLFilters
- SQLFilter,
- SQLFilterUnion,
- combineFilters,
- escapeSingleQuote,
- unescapeSingleQuote,
- isNull,
- isNotNull,
- like,
- notlike,
- ilike,
- notilike,
- rlike,
- notrlike,
- eq,
- noteq,
- leq,
- geq,
- lt,
- gt,
- between,
- notin,
- isin
AggregateWaterfall
- getAggregateWaterfallDistribution
- chartAggregateWaterfall
WhatIf
- WhatIf.getWhatIfAnalysis
WhatChanged
- WhatChanged.whatElseChanged
AnomalyDetection
- AnomalyDetection.chartAlerts
- AnomalyDetection.chartAnomalies
- AnomalyDetection.chartDynamicAlert
IQR
KNNICAD
NAB
DataStories
- DataStories.getResourceImpactData
